Viewing Linux System Resources and Troubleshooting Hardware Faults (qbit)
Preface
This article applies to
System resources
OS version
# Distribution
cat /etc/issue
lsb_release -a
# Kernel version
uname -a
cat /proc/version
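On most current distributions, `/etc/os-release` is a steadier source than `/etc/issue`, which administrators often customize. The following is a minimal sketch, assuming the file exists and defines `PRETTY_NAME`; the function name `os_pretty_name` is made up for illustration.

```shell
# Print the human-readable distribution name from an os-release style file.
# The path is an optional argument so the function can be tried on a copy.
os_pretty_name() {
    # Source the file in a subshell so its variables do not leak into the caller.
    (
        . "${1:-/etc/os-release}" || exit 1
        printf '%s\n' "${PRETTY_NAME:-unknown}"
    )
}
```

Calling `os_pretty_name` with no argument reads `/etc/os-release` on the local machine.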
System resource overview
top
# OR
htop
Motherboard information
# Show bus information
sudo lshw -businfo
sudo lshw -short
# Show the motherboard model
sudo lshw -c bus | grep Motherboard -C 7
CPU
# Show CPU information
sudo lshw -c cpu # recommended
lscpu
cat /proc/cpuinfo
# Show the CPU model
grep "model name" /proc/cpuinfo | cut -f2 -d: | uniq -c
# Total cores = number of physical CPUs x cores per physical CPU
# Total logical CPUs = number of physical CPUs x cores per physical CPU x hyper-threads per core
# Count physical CPUs (sockets)
grep "physical id" /proc/cpuinfo | sort -u | wc -l
# Show the number of cores in each physical CPU
grep "cpu cores" /proc/cpuinfo | uniq
# Count logical CPUs
grep -c "^processor" /proc/cpuinfo
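The two formulas above can be checked in one pass over `/proc/cpuinfo`. This is a sketch under the assumption of an x86-style layout where every entry carries `physical id` and `cpu cores`; the function name `cpu_topology` is hypothetical, and on virtual machines or ARM boards those fields may be absent.

```shell
# Summarize CPU topology from a /proc/cpuinfo style file (x86 layout assumed).
cpu_topology() {
    awk -F': *' '
        /^physical id/ { sockets[$2] = 1 }   # one array key per distinct socket
        /^cpu cores/   { cores = $2 }        # cores per socket, repeated on every entry
        /^processor/   { logical++ }         # one entry per logical CPU
        END {
            n = 0
            for (id in sockets) n++
            if (n == 0 || cores == 0) { print "topology fields not found"; exit 1 }
            printf "sockets=%d cores_per_socket=%d logical=%d threads_per_core=%d\n",
                   n, cores, logical, logical / (n * cores)
        }
    ' "${1:-/proc/cpuinfo}"
}
```

Run `cpu_topology` with no argument to inspect the local machine; a `threads_per_core` of 2 indicates hyper-threading is active.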
Memory
free -h
cat /proc/meminfo
# Show memory module information
sudo lshw -c memory
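When a single number is easier to watch than the full `free` table, the usage percentage can be derived from `/proc/meminfo`. This is a rough sketch, assuming a kernel that reports `MemAvailable` (3.14 or later); the function name `mem_used_percent` is invented for this example.

```shell
# Print the percentage of memory in use, based on MemTotal and MemAvailable.
mem_used_percent() {
    awk '
        $1 == "MemTotal:"     { total = $2 }   # values are in kB
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total == 0) { print "MemTotal not found"; exit 1 }
            printf "%.1f\n", (total - avail) * 100 / total
        }
    ' "${1:-/proc/meminfo}"
}
```

Called without arguments, `mem_used_percent` reads the live `/proc/meminfo`.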
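Disks
Show the disk model
sudo hdparm -i /dev/sda | grep "Model"
Random read/write IOPS
# WARNING: this job writes to the raw device /dev/sda and destroys data on it; point -filename at a scratch disk or a test file instead
sudo fio -filename=/dev/sda -direct=1 -iodepth 1 -thread -rw=randrw -rwmixread=70 -ioengine=psync -bs=16k -size=15G -numjobs=20 -runtime=60 -group_reporting -name=mytest
Read throughput (cached and buffered reads)
sudo hdparm -Tt --direct /dev/sda
Disk rotation rate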
$ sudo hdparm -I /dev/sda | grep Rotation
Nominal Media Rotation Rate: 5400
$ sudo sg_vpd -a /dev/sda | grep rpm
Nominal rotation rate: 5400 rpm
Partitions and capacity
# Disk capacity and partitions (mounted file systems only)
df -Th
# Disk capacity and partitions (includes unmounted partitions)
sudo fdisk -l
sudo lsblk -f
# Show partition UUIDs (includes unmounted partitions)
sudo blkid
# Size of the /lib directory
du -sh /lib
# Size of each entry under /lib
du -sh /lib/*
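The capacity listing can be reduced to only the file systems that are close to full. This sketch reads `df -P` output on stdin; the function name `fs_over_threshold` and the default limit of 90% are assumptions for illustration, and mount points containing spaces are not handled.

```shell
# Read `df -P` output on stdin and print mount points at or above a usage limit.
fs_over_threshold() {
    awk -v limit="${1:-90}" '
        NR > 1 {                            # skip the header line
            pct = $5
            sub(/%$/, "", pct)              # "93%" -> "93"
            if (pct + 0 >= limit + 0) print $6, pct "%"
        }
    '
}
```

A typical call would be `df -P | fs_over_threshold 80`.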
Disk I/O
sudo iotop
# d: report per-device utilization
# m: show values in MB
# 2: sample every 2 seconds
# 3: print 3 reports in total
iostat -dm 2 3
Hardware troubleshooting
Disk health
Quick disk health check
$ sudo smartctl -H /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-89-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
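For scripted checks, the result line shown above can be tested directly. This is a minimal sketch that only fits output worded like the sample; other device types may phrase the result differently, and the function name `smart_health_passed` is hypothetical.

```shell
# Read `smartctl -H` output on stdin; succeed only if the overall result is PASSED.
smart_health_passed() {
    grep -q 'overall-health self-assessment test result: PASSED'
}
```

For example, `sudo smartctl -H /dev/sda | smart_health_passed || echo "check /dev/sda"` prints a notice when the result line is missing or not PASSED.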
Show disk identity information
$ sudo smartctl -i /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-89-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.14 (AF)
Device Model: ST2000DM001-1ER164
Serial Number: W4Z40M4R
LU WWN Device Id: 5 000c50 09d433979
Firmware Version: CC26
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Wed Nov 3 15:42:35 2021 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Show device attributes (temperature, read/write counts, power-on hours, etc.)
$ sudo smartctl -A /dev/nvme1n1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-89-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF SMART DATA SECTION ===
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 33 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 0%
Data Units Read: 2,905,845 [1.48 TB]
Data Units Written: 37,741,109 [19.3 TB]
Host Read Commands: 24,704,077
Host Write Commands: 171,328,846
Controller Busy Time: 991
Power Cycles: 32
Power On Hours: 5,525
Unsafe Shutdowns: 21
Media and Data Integrity Errors: 0
Error Information Log Entries: 1
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
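A few of the NVMe fields above are the usual first things to look at: `Critical Warning`, `Available Spare` against its threshold, `Percentage Used`, and `Media and Data Integrity Errors`. The sketch below flags them from `smartctl -A` output on stdin. The function name `nvme_attr_check` and the default wear limit of 80% are assumptions chosen for illustration, not vendor guidance.

```shell
# Read `smartctl -A` output for an NVMe device on stdin and print warnings.
# Exit status is 0 when nothing looks wrong, 1 otherwise.
nvme_attr_check() {
    awk -F': *' -v max_used="${1:-80}" '
        { gsub(/[%,]/, "", $2) }            # "100%" -> "100", "2,905,845" -> "2905845"
        $1 == "Critical Warning" && $2 != "0x00" { print "critical warning flag set: " $2; bad = 1 }
        $1 == "Available Spare"           { spare = $2 + 0 }
        $1 == "Available Spare Threshold" { spare_min = $2 + 0 }
        $1 == "Percentage Used" && $2 + 0 >= max_used + 0 { print "wear at " $2 "%"; bad = 1 }
        $1 == "Media and Data Integrity Errors" && $2 + 0 > 0 { print "media errors: " $2; bad = 1 }
        END {
            if (spare_min > 0 && spare <= spare_min) { print "spare blocks low: " spare "%"; bad = 1 }
            exit (bad ? 1 : 0)
        }
    '
}
```

A typical call would be `sudo smartctl -A /dev/nvme1n1 | nvme_attr_check`; for the sample output above it prints nothing and returns 0.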
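Log inspection
Kernel log
Kernel log location:
The file can be opened directly, but a dedicated command is the recommended way to view it
System log
System log location:
journald logs
journald log file location:
These files cannot be opened directly; a dedicated command is the recommended way to view them
Example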
journalctl --dmesg
# roughly equivalent to
dmesg -T
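When hunting for hardware faults, the kernel log is usually filtered rather than read in full. The sketch below keeps only lines containing a few strings that often accompany disk, memory, or CPU errors. The keyword list and the function name `hw_error_lines` are illustrative assumptions, not an exhaustive rule set.

```shell
# Read kernel log text on stdin and print lines that often accompany hardware faults.
# The keyword list is illustrative, not exhaustive.
hw_error_lines() {
    grep -iE 'i/o error|hardware error|machine check|edac'
}
```

For example, `sudo dmesg -T | hw_error_lines`. With journald, `journalctl -k -p err -b` gives a comparable view by limiting kernel messages from the current boot to priority `err` and above.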
Colorized output
journalctl | ccze -A