The uniq command: report or omit repeated lines

1. Overview

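The uniq command filters repeated lines in its input: it can drop them, report them, or count them. Because uniq compares only adjacent lines, it is usually run on the output of sort.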
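
To make the adjacency rule concrete, here is a small sketch. The three sample words are invented for illustration and are supplied through printf rather than read from a real file.

```bash
# Hypothetical sample input: "apple" appears twice, but not on adjacent lines.
input=$(printf 'apple\nbanana\napple\n')

# Without sorting, uniq finds no adjacent duplicates and keeps all three lines.
unsorted=$(printf '%s\n' "$input" | uniq)

# After sorting, the two "apple" lines are adjacent and collapse into one.
sorted=$(printf '%s\n' "$input" | sort | uniq)

printf 'unsorted:\n%s\n\nsorted:\n%s\n' "$unsorted" "$sorted"
```

The unsorted run still contains both copies of "apple", while the sorted run contains each word once.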

2. Basic syntax

```bash
uniq [OPTION]... [INPUT [OUTPUT]]
```

3. Common options

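-c: prefix each output line with the number of times it occurred
-d: print only lines that are repeated, one copy per group
-u: print only lines that are not repeated
-i: ignore differences in case when comparing
-f n: skip the first n fields when comparing
-s n: skip the first n characters when comparing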

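4. Basic examples

Remove duplicate lines:
```bash
sort file.txt | uniq
```
Count how many times each line occurs:
```bash
sort file.txt | uniq -c
```
Show only the lines that are repeated:
```bash
sort file.txt | uniq -d
```
Show only the lines that are not repeated:
```bash
sort file.txt | uniq -u
```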
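
The following sketch runs -d, -u, and -c against the same invented input so the results can be compared side by side. The sample letters are hypothetical, and the awk step is an addition that only normalizes the padding in front of the count, which differs between uniq implementations.

```bash
# Hypothetical sample data: "a" occurs three times, "b" once, "c" twice.
data=$(printf 'a\nb\na\nc\na\nc\n')

# -d prints one copy of each line that occurs more than once.
dups=$(printf '%s\n' "$data" | sort | uniq -d)

# -u prints only the lines that occur exactly once.
singles=$(printf '%s\n' "$data" | sort | uniq -u)

# -c prefixes each distinct line with its count; awk removes the padding.
counts=$(printf '%s\n' "$data" | sort | uniq -c | awk '{ print $1, $2 }')

printf 'repeated:\n%s\n\nnot repeated:\n%s\n\ncounts:\n%s\n' \
  "$dups" "$singles" "$counts"
```

Here the repeated lines are "a" and "c", the only unrepeated line is "b", and the counts are 3, 1, and 2 for "a", "b", and "c".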

5. Advanced usage

Ignore case (sort -f folds case, so lines that differ only in case end up adjacent):
```bash
sort -f file.txt | uniq -i
```
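Skip the first N fields of each line when comparing:
```bash
# Sort on field 3 onward so lines that match after the skipped fields are adjacent
sort -k3 file.txt | uniq -f 2
```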
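Skip the first N characters of each line when comparing:
```bash
# Assuming GNU sort: start the sort key at character 6 so lines that match
# after the skipped 5-character prefix are adjacent
sort -k1.6 file.txt | uniq -s 5
```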
Combine several options:
```bash
sort -f file.txt | uniq -c -i
```
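
As a worked illustration of -f and -s, the sketch below uses an invented four-line log whose first field is a timestamp. The log contents and variable names are hypothetical. The input is sorted on field 2 onward so that lines matching after the timestamp become adjacent.

```bash
# Hypothetical log: a timestamp, a user name, and an action.
log=$(printf '10:01 alice login\n10:02 alice login\n10:03 bob login\n10:04 alice login\n')

# Skip the first field (the timestamp) when comparing.
by_field=$(printf '%s\n' "$log" | sort -k2 | uniq -f 1)

# Same result here, because the timestamp plus its trailing space
# is exactly 6 characters wide on every line.
by_chars=$(printf '%s\n' "$log" | sort -k2 | uniq -s 6)

printf '%s\n' "$by_field"
```

uniq keeps the first line of each group, so the output is the earliest "alice" line followed by the "bob" line.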

6. Practical examples

Find the most frequent lines:

```bash
sort file.txt | uniq -c | sort -nr | head
```

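Count word frequencies (tr -s squeezes runs of whitespace so that empty lines are not counted as words):

```bash
tr -s '[:space:]' '\n' < file.txt | sort | uniq -c | sort -nr
```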

List the unique IP addresses in a log file (assuming the address is the first field):

```bash
awk '{print $1}' access.log | sort | uniq
```

Show the lines that appear in only one of two files (this assumes neither file contains duplicates of its own):
```bash
sort file1.txt file2.txt | uniq -u
```
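
When a file does contain repeated lines of its own, the pipeline above drops them as well, since uniq -u cannot tell which file a line came from. The sketch below shows the effect and one possible workaround, deduplicating each file with sort -u first. The file contents are invented and created in a temporary directory purely for illustration.

```bash
# Hypothetical files created in a temporary directory.
tmp=$(mktemp -d)
printf 'x\nx\ny\n' > "$tmp/file1.txt"   # "x" is repeated inside file1
printf 'y\nz\n'    > "$tmp/file2.txt"

# Naive version: "x" is lost because it repeats within file1 itself.
naive=$(sort "$tmp/file1.txt" "$tmp/file2.txt" | uniq -u)

# Deduplicate each file first so that only cross-file repeats are removed.
fixed=$({ sort -u "$tmp/file1.txt"; sort -u "$tmp/file2.txt"; } | sort | uniq -u)

printf 'naive:\n%s\n\nfixed:\n%s\n' "$naive" "$fixed"
rm -r "$tmp"
```

The naive pipeline reports only "z", while the deduplicated version reports both "x" and "z".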

7. Notes

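uniq compares only adjacent lines by default, so the input usually needs to be sorted first.
With -c, the count appears at the start of each output line.
uniq itself only needs to hold adjacent lines in memory. On large files it is the preceding sort that consumes significant memory and temporary disk space.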
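
Since the count sits at the start of each line, and common implementations such as GNU and BSD uniq pad it with leading spaces, it can help to normalize the output before passing it to other tools. The following is one possible sketch with invented sample words. The awk program is an assumption about how you might want the output shaped, not part of uniq.

```bash
# Hypothetical sample input.
words=$(printf 'red\nblue\nred\nred\nblue\ngreen\n')

# Count, rank by frequency, then emit "count<TAB>line".
# The awk step strips the optional padding and the count itself,
# so lines that contain spaces stay intact.
table=$(printf '%s\n' "$words" | sort | uniq -c | sort -nr |
  awk '{ n = $1; sub(/^ *[0-9]+ /, ""); print n "\t" $0 }')

printf '%s\n' "$table"
```

Each output line then has exactly two tab-separated columns, which is easier to load into a spreadsheet or feed to cut.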

8. Related commands

sort: sort lines of text
comm: compare two sorted files line by line
awk: a more general text-processing tool
sed: stream editor

9. Tips

Combine sort and uniq to save the deduplicated lines to a file (sort -u file.txt gives the same result):
```bash
sort file.txt | uniq > unique_lines.txt
```

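Use process substitution (available in bash, zsh, and ksh) to show the lines that are unique to each of two files:

```bash
comm -3 <(sort file1.txt | uniq) <(sort file2.txt | uniq)
```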

Count the distinct lines in a pipeline:
```bash
sort file.txt | uniq | wc -l
```
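Combine with cut to count the values of a single field (cut splits on tabs by default; use -d to choose another delimiter):
```bash
cut -f2 file.txt | sort | uniq -c
```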

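Although uniq does only one thing, it is very useful in text processing and data analysis. Combined with commands such as sort, cut, and awk, it handles deduplication and frequency counting efficiently, which speeds up tasks like log analysis, data cleaning, and text analysis.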