This article aims to give a clear picture of the relation between filenames and inodes: it explains what an inode is and how a hard link differs from a symbolic link.
2.1 Filename as Hardlink to Inode

On ext3 and ext4 (and some other filesystems) a file is stored internally as an inode and a filename is just a pointer to that inode, called a hardlink.
The bytes stored in a file are its data, while the file metadata is the filesystem's information about that file, such as timestamps, ownership and permissions (and other low-level properties like the table of allocated blocks). The inode holds the metadata together with the pointers to the data blocks. The command ‘stat’ shows you some of the metadata of a file:
echo "1234" >file1 #create a 5-byte file (4 characters plus the newline)
cp -v file1 file2 #copy the file to a new inode
`file1' -> `file2'
cp -lv file1 file1h #create a new hardlink to the file
`file1' -> `file1h'
cp -sv file1 file1s #create a symlink to the file
`file1' -> `file1s'
stat file* #show information about created files
  File: `file1'
  Size: 5          Blocks: 8          IO Block: 4096   regular file
Device: 802h/2050d Inode: 18129       Links: 2
Access: (0644/-rw-r--r--)  Uid: ( 0/ root)   Gid: ( 0/ root)
Access: 2013-07-19 18:29:57.000000000 +0200
Modify: 2013-07-19 18:29:57.000000000 +0200
Change: 2013-07-19 18:29:57.000000000 +0200
 Birth: -
  File: `file1h'
  Size: 5          Blocks: 8          IO Block: 4096   regular file
Device: 802h/2050d Inode: 18129       Links: 2
Access: (0644/-rw-r--r--)  Uid: ( 0/ root)   Gid: ( 0/ root)
Access: 2013-07-19 18:29:57.000000000 +0200
Modify: 2013-07-19 18:29:57.000000000 +0200
Change: 2013-07-19 18:29:57.000000000 +0200
 Birth: -
  File: `file1s' -> `file1'
  Size: 5          Blocks: 0          IO Block: 4096   symbolic link
Device: 802h/2050d Inode: 18131       Links: 1
Access: (0777/lrwxrwxrwx)  Uid: ( 0/ root)   Gid: ( 0/ root)
Access: 2013-07-19 18:29:57.000000000 +0200
Modify: 2013-07-19 18:29:57.000000000 +0200
Change: 2013-07-19 18:29:57.000000000 +0200
 Birth: -
  File: `file2'
  Size: 5          Blocks: 8          IO Block: 4096   regular file
Device: 802h/2050d Inode: 18130       Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 0/ root)   Gid: ( 0/ root)
Access: 2013-07-19 18:29:57.000000000 +0200
Modify: 2013-07-19 18:29:57.000000000 +0200
Change: 2013-07-19 18:29:57.000000000 +0200
 Birth: -
As you can see, ‘cp -lv file1 file1h’ just creates a new hardlink to the same inode 18129: file1 and file1h are two filenames for a single file with a single set of timestamps, ownerships and permissions (cf. File Ownerships, Permissions and Timestamps).
‘cp -v file1 file2’, on the other hand, allocates a new inode (18130), duplicates all data blocks and creates a hardlink named file2 to that new inode, i.e. it creates a new file.
Finally, ‘cp -sv file1 file1s’ creates a symlink, which has its own inode (18131) and merely stores the filename ‘file1’ it points to.
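The same comparison can be scripted without reading the full ‘stat’ output. The following is only a minimal sketch: it assumes GNU coreutils (‘stat -c’ with the ‘%i’ inode and ‘%h’ link-count format specifiers) and works in a scratch directory created with ‘mktemp’; the variable names are illustrative.

```shell
#!/bin/sh
# Sketch: compare the inode numbers of a copy, a hardlink and a symlink.
# Assumes GNU stat (-c FORMAT); file names mirror the example above.
tmp=$(mktemp -d)
echo "1234" > "$tmp/file1"
cp "$tmp/file1" "$tmp/file2"        # copy: gets a new inode
ln "$tmp/file1" "$tmp/file1h"       # hardlink: shares the inode of file1
ln -s file1 "$tmp/file1s"           # symlink: own inode, stores the name "file1"

ino1=$(stat -c %i "$tmp/file1")
ino1h=$(stat -c %i "$tmp/file1h")
ino2=$(stat -c %i "$tmp/file2")
ino1s=$(stat -c %i "$tmp/file1s")   # without -L, stat reports the link itself
links1=$(stat -c %h "$tmp/file1")   # number of hardlinks to the inode

echo "file1=$ino1 file1h=$ino1h file2=$ino2 file1s=$ino1s links(file1)=$links1"
rm -rf "$tmp"
```

The first two inode numbers should be equal, the other two should differ from them, and the link count of file1 should be 2.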
To detect inodes with multiple hardlinks you can use for example ‘fdupes -Hr /bin’:
#find duplicate hardlinks (with an 'awk' command to keep only duplicate lines; the arrays of arrays need gawk 4 or later):
fdupes -Hrq /bin/ | xargs -L 1 ls -i | awk '{L[$1][++I[$1]]=$0};END{for(i in I)if(I[i]>1)for(l in L[i])print L[i][l]}'
233900 /bin/bzip2
233900 /bin/bunzip2
233900 /bin/bzcat
229391 /bin/uncompress
229391 /bin/gunzip
229394 /bin/nisdomainname
229394 /bin/ypdomainname
229394 /bin/dnsdomainname
229394 /bin/domainname
You can also use ‘find’:
find /bin/ -inum $(ls -i /bin/bzip2 | awk '{print $1}')
/bin/bzip2
/bin/bunzip2
/bin/bzcat
find /bin/ -samefile /bin/bzip2
/bin/bzip2
/bin/bunzip2
/bin/bzcat
#find duplicate hardlinks:
find /bin/ -xdev -ls 2>/dev/null | awk '{
    i = $1; sub(/[^/]*\.?\//, "./")
    inum[i] = inum[i] ? inum[i] SUBSEP $0 : $0
}
END {
    for (I in inum) {
        if ((n = split(inum[I], files, SUBSEP)) > 1) {
            print "hardlinks to inode",I":"
            for (i = 1; i <= n; i++)
                print files[i]
        }
    }
}'
hardlinks to inode 229394:
./bin/ypdomainname
./bin/dnsdomainname
./bin/domainname
./bin/nisdomainname
hardlinks to inode 233900:
./bin/bzip2
./bin/bunzip2
./bin/bzcat
hardlinks to inode 229391:
./bin/uncompress
./bin/gunzip
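A shorter way to list only the files that have more than one hardlink is to let ‘find’ test the link count itself. The sketch below assumes GNU findutils (‘-links’ and ‘-printf’ with ‘%i’ inode, ‘%n’ link count, ‘%p’ path) and uses a scratch directory with made-up file names instead of /bin so that it can be run anywhere.

```shell
#!/bin/sh
# Sketch: list regular files whose inode has more than one hardlink.
# Assumes GNU find; the files a, b and c are hypothetical test data.
tmp=$(mktemp -d)
echo "data"  > "$tmp/a"
ln "$tmp/a" "$tmp/b"                # a and b share one inode
echo "other" > "$tmp/c"             # c has a single link

# Print "inode linkcount path", sorted so that names of one inode are adjacent.
multi=$(find "$tmp" -xdev -type f -links +1 -printf '%i %n %p\n' | sort -n)
echo "$multi"
count=$(printf '%s\n' "$multi" | wc -l)
rm -rf "$tmp"
```

Replacing "$tmp" with /bin/ gives the same kind of listing as the commands above, one line per filename.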
#find symlinks:
find /bin/ -type l -ls
70855 0 lrwxrwxrwx 1 root root 4 Dec 30 2012 /bin/rbash -> bash
229429 0 lrwxrwxrwx 1 root root 20 May 25 2012 /bin/mt -> /etc/alternatives/mt
234771 0 lrwxrwxrwx 1 root root 24 Jul 17 09:58 /bin/netcat -> /etc/alternatives/netcat
234837 0 lrwxrwxrwx 1 root root 6 Jul 17 13:50 /bin/bzcmp -> bzdiff
234984 0 lrwxrwxrwx 1 root root 6 Jul 17 13:56 /bin/open -> openvt
234253 0 lrwxrwxrwx 1 root root 4 Jul 17 13:58 /bin/lsmod -> kmod
229406 0 lrwxrwxrwx 1 root root 14 Jul 17 09:50 /bin/pidof -> /sbin/killall5
233711 0 lrwxrwxrwx 1 root root 8 Jun 10 2012 /bin/lessfile -> lesspipe
234843 0 lrwxrwxrwx 1 root root 6 Jul 17 13:50 /bin/bzless -> bzmore
234839 0 lrwxrwxrwx 1 root root 6 Jul 17 13:50 /bin/bzegrep -> bzgrep
558132 0 lrwxrwxrwx 1 root root 4 Jun 22 2012 /bin/rnano -> nano
234769 0 lrwxrwxrwx 1 root root 20 Jul 17 09:58 /bin/nc -> /etc/alternatives/nc
229379 0 lrwxrwxrwx 1 root root 4 Jul 17 09:50 /bin/sh -> dash
234841 0 lrwxrwxrwx 1 root root 6 Jul 17 13:50 /bin/bzfgrep -> bzgrep
2.2 Differences between hardlinks and symlinks

When you delete a file you actually remove the filename from the directory index and remove one link to the inode (reminder: a filename is a pointer/hardlink to an inode).
The inode is only freed when no hardlink is left and all processes have closed their file descriptors on it: open descriptors are not counted in the link count, but they keep the inode alive just like links do (see /proc/$$/fd/).
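This behaviour is easy to observe from a shell. The following is a small sketch under the assumption of a POSIX shell and a scratch directory from ‘mktemp’; descriptor number 3 and the file name are arbitrary choices.

```shell
#!/bin/sh
# Sketch: the data stays readable through an open descriptor
# after the last filename has been removed.
tmp=$(mktemp -d)
echo "still here" > "$tmp/f"
exec 3< "$tmp/f"          # open the file for reading on descriptor 3
rm "$tmp/f"               # remove the only filename: no hardlink is left
if [ -e "$tmp/f" ]; then name_exists=yes; else name_exists=no; fi
content=$(cat <&3)        # the inode and its data are still reachable
exec 3<&-                 # closing the descriptor lets the filesystem free the inode
echo "name exists: $name_exists, content: $content"
rmdir "$tmp"
```

While the descriptor is open, the removed file is still listed under /proc/$$/fd/ on Linux.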
When you modify the file it affects all filenames that are linked to it.
A hardlink resembles a symlink (symbolic link: a pointer to a filename, given as a relative or absolute path) but is completely transparent to applications. For example, moving, renaming or deleting a file does not affect another hardlink to its inode, whereas it breaks a symlink pointing to its filename; several hardlinks to the same inode are indistinguishable from one another. In the example above, the new hardlink ‘file1h’ is identical to ‘file1’, and deleting ‘file1’ would not affect ‘file1h’; it would, however, leave the symlink ‘file1s’ dangling.
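The effect of deleting the original name can be checked with a few commands. This is a minimal sketch assuming a POSIX shell plus ‘readlink’ and a scratch directory; it reuses the file names of the earlier example.

```shell
#!/bin/sh
# Sketch: deleting file1 leaves the hardlink intact and the symlink dangling.
tmp=$(mktemp -d)
echo "1234" > "$tmp/file1"
ln "$tmp/file1" "$tmp/file1h"       # second name for the same inode
ln -s file1 "$tmp/file1s"           # symlink storing the name "file1"
rm "$tmp/file1"

hard_content=$(cat "$tmp/file1h")   # data is still reachable via file1h
target=$(readlink "$tmp/file1s")    # the symlink still stores the old name
# -L tests the link itself, -e follows it to the (now missing) target
if [ -L "$tmp/file1s" ] && [ ! -e "$tmp/file1s" ]; then
    sym_state=dangling
else
    sym_state=ok
fi
echo "file1h: $hard_content, file1s -> $target ($sym_state)"
rm -rf "$tmp"
```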
Though symlinks may cross filesystems, a hardlink cannot point to an inode outside of its own filesystem, because every filesystem has its own space of inode numbers. (cf. http://linuxgazette.net/105/pitcher.html)
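Before trying to hardlink, a script can compare the device numbers of the source and the destination directory. The sketch below assumes GNU ‘stat -c %d’; the helper name ‘same_fs’ is hypothetical. An equal device number is only a necessary hint, not a guarantee, so the result of ‘ln’ itself remains the final authority.

```shell
#!/bin/sh
# Sketch: rough check whether two paths live on the same filesystem.
# same_fs is a hypothetical helper; it assumes GNU stat (-c %d = device number).
same_fs() {
    [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]
}

tmp=$(mktemp -d)
touch "$tmp/a"
if same_fs "$tmp/a" "$tmp"; then
    result="hardlink possible"
else
    result="use a symlink or a copy"
fi
echo "$result"
rm -rf "$tmp"
```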
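Only symlinks can be created to point to directories; users cannot create hardlinks to directories (the ‘.’ and ‘..’ entries are hardlinks maintained by the filesystem itself).
Hardlinks can be very handy for creating backups without using new inodes and disk space for unchanged files. (cf. HOWTO – LOCAL AND REMOTE SNAPSHOT BACKUP USING RSYNC WITH HARD LINKS)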
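As an illustration of the idea (not the rsync method of the HOWTO mentioned above), a whole tree can be snapshotted with ‘cp -al’, which recreates the directories and hardlinks every file. This sketch assumes GNU cp; the directory and file names are made up.

```shell
#!/bin/sh
# Sketch: hardlink snapshot of a directory tree with GNU cp -al.
tmp=$(mktemp -d)
mkdir -p "$tmp/data/sub"
echo "one" > "$tmp/data/a"
echo "two" > "$tmp/data/sub/b"

cp -al "$tmp/data" "$tmp/snapshot"  # new directories, hardlinked files

src_ino=$(stat -c %i "$tmp/data/sub/b")
snap_ino=$(stat -c %i "$tmp/snapshot/sub/b")

# Replacing a file with a new inode leaves the snapshot untouched;
# editing it in place would change both names, since they share the data.
echo "changed" > "$tmp/data/a.new" && mv "$tmp/data/a.new" "$tmp/data/a"
snap_a=$(cat "$tmp/snapshot/a")
echo "same inode: $src_ino = $snap_ino, snapshot/a still contains: $snap_a"
rm -rf "$tmp"
```

Such a snapshot only protects against deletion and replacement of files, not against in-place modification, because modifying a file affects all filenames linked to it.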
Useful commands:
#show duplicates only if also present in folder duplic
fdupes -r folder1/ folder2/ duplic/ | egrep -B1 "^duplic/" | egrep -v "^(--|duplic/)" | while read i; do [ -n "$i" ] && ls -l "$i"; done
#remove one duplicate at the end of each set (remove 'echo' to do it)
fdupes -r folder/ | egrep -B1 "^$" | egrep -v "^(--|)$" | while read i; do [ -n "$i" ] && echo rm -v "$i"; done
#show duplicates sorted by size and number of occurrences:
fdupes -r1 folder/ >/tmp/fdupes.txt
cat /tmp/fdupes.txt | awk '{print length,$0}' | sort -n | cut -d" " -f2- | while read a; do echo $(du -cbs $a | tail -1; echo "$a" | sed -e 's% %\n%g' | wc -l; echo $a); done | sort -n | tail
#remove all occurrences of file1 and its duplicates (remove 'echo' to do it)
fgrep 'folder/path/to/file1' /tmp/fdupes.txt | while read -d " " a; do echo rm -f "$a"; done | less -S
#remove empty folders
find folder/ -depth -type d -empty -exec rmdir \{\} \; | less -S
#create a folder recursive list into a folder.list
(cd /path/to/folder/ && find . -type f -printf "%p %s %T+\n" | sort) >folder.list
#show recursive size of current folder
for i in *; do echo -n "$i"; find "$i" -xdev -type f -ls | awk 'BEGIN {sum=0}; {sum+=$7}; END {printf ("%.20g\n", sum)}'; done | sort -nk2 | column -t
#compare folders
sdiff -sdbB <(cd folder1/ && find . -type f -printf "%p %T+ %s\n" | sort) <(cd folder2 && find . -type f -printf "%p %T+ %s\n" | sort) | less -
