
UNIX/Linux

Command lines

Shell

Each account's shell is configured in /etc/passwd

Show the current shell

echo $0
# or
ps -p $$
# or
echo $SHELL  # note: $SHELL is the login shell, not necessarily the one currently running

List the shells available on the system

cat /etc/shells
Set the default shell
chsh -s /bin/zsh

do something 2>&1
file descriptor:

  • 0: stdin
  • 1: stdout
  • 2: stderr

2>&1 redirects stderr to stdout
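A minimal sketch of the difference (demo is a hypothetical function that writes to both streams; f.log an arbitrary file name):

```shell
# demo command: writes "out" to stdout and "err" to stderr
demo() { echo out; echo err >&2; }

demo > f.log          # "err" still appears on the terminal
demo > f.log 2>&1     # both lines end up in f.log
cat f.log
# out
# err
rm f.log
```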

Editing files

select columns in a CSV (dropping the rest)
cut -d, -f [range]
examples:

test.csv

A,B,C,D,E,F,G,H

cut -d, -f 1 < test.csv # A
cut -d, -f 1-5 < test.csv # A,B,C,D,E
cut -d, -f 1-3,6- < test.csv # A,B,C,F,G,H
append , to each line of a CSV
sed 's/$/,/' input_file > output_file

Finding things


find file size < 5k
find . -size -5k -type f

find files created in the last 5 minutes
find . -cmin -5
find a file named foo modified within the last 5 days
find . -name foo -mtime -5

find all folders and delete OS thumb files
find . -name ".DS_Store" -delete
find . -type f -name '._*' -delete
find . -type f -not -name '._*' -size -5k
count number of files
find [path] -type f | wc -l
batch change extension to lower case (.JPG → .jpg)
for f in *.JPG; do mv "$f" "${f%.JPG}.jpg"; done

Find files older than one year and sum their sizes

find . -type f -mtime +365 -print0 | du --files0-from=- -ch | grep total

grep

  • -o, --only-matching: print only the matched part, not the whole line (a relief with minified files)
  • -P, --perl-regexp: use Perl-compatible regular expressions
  • -l: print only the names of matching files
  • -A 3: print 3 lines after each match
  • -B 3: print 3 lines before each match
  • -C 3: print 3 lines before and after each match
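Combining -o and -P, a small sketch (the input line is made up; -P requires GNU grep):

```shell
# pull every run of digits out of a line without printing the whole line
printf 'width=320,height=240\n' | grep -oP '\d+'
# 320
# 240
```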

count, sum

count files
ls -l | wc -l  # off by one: includes the "total" line
find . -type f |wc -l

count files per directory
du -a | cut -d/ -f2 | sort | uniq -c | sort -nr
via: https://stackoverflow.com/a/39622947/644070

  • du -a: list all files, including directories (in "416 ./images/path/to", 416 is the size)
  • cut -d/ -f2: remove sections from each line, -d: delimiter, -f: select only these fields (extracts the first-level directory name, here images)
  • uniq -c: -c: prefix each line with the number of occurrences
  • sort -nr: -n: numeric sort, -r: reverse

count files in each folder
find . -type f | cut -d/ -f2 | sort | uniq -c
du -a | cut -d/ -f2 | sort | uniq -c | sort -nr
via: https://stackoverflow.com/a/54305758/644070
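A quick demo of the per-directory count (directory names a and b are made up for the example):

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/a" "$tmp/b"
touch "$tmp/a/1" "$tmp/a/2" "$tmp/b/1"
# two files under a, one under b
(cd "$tmp" && find . -type f | cut -d/ -f2 | sort | uniq -c)
#    2 a
#    1 b
rm -rf "$tmp"
```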

top 20 biggest folders
sudo du -a /dir/ | sort -n -r | head -n 20
size of each directory
du -sh *
total size per first-level directory
# du -d 1 | sort -nr | cut -f2- | xargs du -hs # mac
du --max-depth=1 | sort -nr | cut -f2- | xargs du -hs
count number of lines in a file
wc -l < some_file.txt
Find directories with a given name and sum their total size

find /path/to/search -type d -name "" -print0 | xargs -0 du -sb | awk '{sum += $1} END {print sum}'
via: chatgpt

  • print0 prints the full directory name on the standard output, followed by a null character (instead of a newline). This is useful to handle directory names with spaces.
  • xargs -0 takes the null-separated output from find and passes it to du.
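A sketch of why the null separator matters, using a made-up directory name containing a space:

```shell
tmp=$(mktemp -d)
mkdir "$tmp/my photos"
# without -print0/-0, xargs would split "my photos" into two arguments;
# with them, the name arrives as a single argument:
find "$tmp" -type d -name "my photos" -print0 | xargs -0 sh -c 'echo "$# directory argument(s)"' sh
# 1 directory argument(s)
rm -rf "$tmp"
```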

process data (awk, sed, grep, tr)

print certain line in a file
sed -n 6489p some_file # print line number: 6489
insert text to certain line
sed -i '1 i\foo' path/to/file # insert "foo" at the top of the file
sed -i '2 i\bar' path/to/file # insert "bar" at line 2
get url and download file in structured text file
cat foo.csv | awk -F, '{print "curl -O https://url.to/"$3".jpg"}' | bash
my-mp3-list.txt
#3-Let It Go,https://www.youtube.com/watch?v=L0MK7qz13bU
awk -F, '{print "youtube-dl " $2 " -x --audio-format mp3 -o " $1 ".mp3"}' my-mp3-list.txt > mp3-cmds.sh
bash mp3-cmds.sh  # or chmod +x mp3-cmds.sh first
get certain attribute value in HTML tag
# write the results to a new file
grep -rho 'data-ga-category="[^"]*"' ./path/to | sed 's/data-ga-category="//' | sed 's/"$//' | uniq > ga-category.txt
# append to an existing file
grep -rho 'data-ga-category="[^"]*"' ./path/to | sed 's/data-ga-category="//' | sed 's/"$//' | uniq >> ga-category.txt
replace line-breaks (\n) with ,
tr '\n' ',' < source-file
delete a character
tr -d '\n' < source-file

Got Excel data as a single column with many rows, and need it reshaped into a single row with many columns
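paste -s serializes a column into one row; a minimal sketch (column.txt and the comma delimiter are just for illustration):

```shell
printf 'a\nb\nc\n' > column.txt   # one value per row
paste -s -d, column.txt
# a,b,c
rm column.txt
```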

split a big file into chunks (1000 lines per file); with prefix "prefix." the output goes to prefix.aa, prefix.ab, prefix.ac...

split
split -l 1000 file-to-be-split prefix.
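A quick demo of the chunking (the 2500-line input is generated just for the example):

```shell
tmp=$(mktemp -d)
seq 1 2500 > "$tmp/big.txt"
(cd "$tmp" && split -l 1000 big.txt part.)
ls "$tmp" | grep '^part\.'
# part.aa
# part.ab
# part.ac
wc -l < "$tmp/part.ac"   # the last chunk holds the remaining 500 lines
rm -rf "$tmp"
```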

Process modern data (jq for JSON)

jq extracts values from JSON, e.g.:

# foo.json contains {"s": {"894": "200"}}
cat foo.json | jq '.s["894"]'
# "200"
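jq -r prints the raw value without JSON quoting, handy when piping into other commands (same made-up document, fed via echo here):

```shell
echo '{"s": {"894": "200"}}' | jq -r '.s["894"]'
# 200
```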

File encoding

check a file's MIME type and encoding
file -ib {foo}
  • -b: brief output (omit the file name)
  • -i: include MIME-type information, but it can be wrong, e.g. file reported "ISO-8859-1" for a file that was actually cp950.

iconv
iconv -f cp950 -t UTF-8 {input-file} > {output-file}

Networks

telnet & openssl:

connection test
telnet my-host.com 80
GET / HTTP/1.1
Host: my-host.com
Connection: close (server closes the connection after responding; finish the request with a blank line)

HTTPS connection test
openssl s_client -connect my-host.com:443
GET / HTTP/1.1
Host: my-host.com
Connection: close (server closes the connection after responding; finish the request with a blank line)
connect via Samba
smbclient //some-ip/folder -U my-username

smb: \> ls
mount a Windows share
sudo mount -t cifs //some-ip/path my-local/ -o username=my-username,password=my-password

smbmount or smbfs seems deprecated, use cifs instead (Debian package: cifs-utils)

Many classic network commands (ifconfig, netstat, route, arp) have a newer replacement: ip. See: nixCraft - Ugh 😤 ip command is the most significant change in... | Facebook

System Admin

  • The difference between su my-user and su - my-user: the extra - loads that user's environment variables and shell configuration; without it, the current user's environment is kept.
Set the locale
sudo dpkg-reconfigure locales

test HD Performance, use the dd command to measure server throughput (write speed):

dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=dsync
Linux and Unix Test Disk I/O Performance With dd Command - nixCraft

htop explained | peteris.rocks

Device

check usb device

check usb version (speed)
lsusb [-t]
# or
cat /sys/bus/usb/devices/usb{2}/speed

check CPU

check CPU
lscpu

Multimedia

convert wav to mp3 (ffmpeg)
find *.wav -exec ffmpeg -i {} {}.mp3 \;
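Note the find form above names its output song.wav.mp3. A loop with parameter expansion drops the .wav first; shown as a dry run with echo (the file names are made up, remove echo to actually convert):

```shell
# dry run: print the ffmpeg commands that would be executed
for f in song.wav take2.wav; do
  echo ffmpeg -i "$f" "${f%.wav}.mp3"
done
# ffmpeg -i song.wav song.mp3
# ffmpeg -i take2.wav take2.mp3
```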

Other examples

check whether AWS S3 keys from a key list in a file (keys.txt) exist, and write the nonexistent ones to an output file
#!/bin/bash
output_file="nonexistent_keys.txt"

while IFS= read -r key; do
    if aws s3 ls "s3://my-bucket/path/to/$key.jpg" > /dev/null 2>&1; then
        echo "$key exists"
    else
        echo "$key does not exist"
        echo "$key" >> "$output_file"
    fi
    fi
done < keys.txt
echo "done"