
Linux System Resource Inspection and Hardware Troubleshooting (qbit)

Preface

  • This article applies to …

Viewing system resources

OS version

# Distribution
cat /etc/issue
lsb_release -a
# Kernel version
uname -a
cat /proc/version
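
When the distribution name is needed inside a script, parsing a key=value file is easier than scraping /etc/issue. The sketch below assumes the system ships /etc/os-release (most systemd-based distributions do); the function name os_release_field is made up for illustration.

```shell
#!/bin/sh
# os_release_field KEY [FILE]: print the value of KEY from an os-release style file.
# FILE defaults to /etc/os-release. The function name is illustrative, not a standard tool.
os_release_field() {
    awk -v k="$1" '
        # Match only lines that start with "KEY=" so NAME does not match PRETTY_NAME.
        index($0, k "=") == 1 {
            v = substr($0, length(k) + 2)
            gsub(/^"|"$/, "", v)   # strip one pair of surrounding double quotes
            print v
            exit
        }
    ' "${2:-/etc/os-release}"
}
```

Typical use would be os_release_field PRETTY_NAME or os_release_field VERSION_ID.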

System resource overview

top
# OR
htop

Motherboard information

# Show bus information
sudo lshw -businfo
sudo lshw -short
# Show the motherboard model
sudo lshw -c bus | grep Motherboard -C 7
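
If lshw is not installed, the same motherboard details are usually exposed through DMI files in sysfs. This is a minimal sketch that assumes an x86-style machine where /sys/class/dmi/id exists; board_info is a hypothetical helper name.

```shell
#!/bin/sh
# board_info [DIR]: print motherboard vendor, model, and version from DMI sysfs files.
# DIR defaults to /sys/class/dmi/id. Files that are missing or unreadable are skipped.
board_info() {
    dir="${1:-/sys/class/dmi/id}"
    for field in board_vendor board_name board_version; do
        if [ -r "$dir/$field" ]; then
            printf '%s: %s\n' "$field" "$(cat "$dir/$field")"
        fi
    done
}
```

Calling board_info with no argument reads the live system; passing a directory makes the function easy to try against copied files.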

CPU

# Show CPU information
lshw -c cpu # recommended
lscpu
cat /proc/cpuinfo

# Show the CPU model
cat /proc/cpuinfo | grep name | cut -f2 -d: | uniq -c

# Total cores = number of physical CPUs x cores per physical CPU
# Total logical CPUs = number of physical CPUs x cores per physical CPU x hardware threads per core

# Count the physical CPUs (sockets)
cat /proc/cpuinfo| grep "physical id"| sort| uniq| wc -l

# Show the number of cores in each physical CPU
cat /proc/cpuinfo| grep "cpu cores"| uniq

# Count the logical CPUs
cat /proc/cpuinfo| grep "processor"| wc -l
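
The three counts above can be gathered in a single pass, which also makes the relationship between sockets, cores, and logical CPUs explicit. The sketch assumes an x86-style /proc/cpuinfo that contains "physical id" and "cpu cores" fields (some ARM systems and virtual machines omit them); cpu_topology is an illustrative name.

```shell
#!/bin/sh
# cpu_topology [FILE]: summarize CPU topology from a /proc/cpuinfo style file.
# FILE defaults to /proc/cpuinfo. threads_per_core is derived as
# logical_cpus / (sockets * cores_per_socket), and is 0 when fields are missing.
cpu_topology() {
    awk -F': *' '
        /^physical id/ { sockets[$2] = 1 }   # one array key per distinct socket
        /^cpu cores/   { cores = $2 }        # cores per socket
        /^processor/   { logical++ }         # one entry per logical CPU
        END {
            n = 0
            for (s in sockets) n++
            threads = (n > 0 && cores > 0) ? logical / (n * cores) : 0
            printf "sockets=%d cores_per_socket=%d logical_cpus=%d threads_per_core=%d\n", n, cores, logical, threads
        }
    ' "${1:-/proc/cpuinfo}"
}
```

A threads_per_core value of 2 indicates that hyper-threading (SMT) is active.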

Memory

free -h
cat /proc/meminfo
# Show memory module (DIMM) information
lshw -c memory
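
For monitoring scripts, a single usage percentage is often more convenient than the full free -h table. The sketch below derives it from the MemTotal and MemAvailable fields of /proc/meminfo; it assumes a kernel recent enough to report MemAvailable, and mem_used_percent is a hypothetical helper name.

```shell
#!/bin/sh
# mem_used_percent [FILE]: print used memory as a percentage with one decimal place.
# "Used" is taken to mean MemTotal - MemAvailable. FILE defaults to /proc/meminfo.
mem_used_percent() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total > 0) printf "%.1f\n", (total - avail) * 100 / total
            else exit 1   # MemTotal missing: signal failure instead of dividing by zero
        }
    ' "${1:-/proc/meminfo}"
}
```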

Disk

  • Show the disk model

sudo hdparm -i /dev/sda | grep "Model"
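  • Random read/write IOPS (caution: this fio job writes to the raw device /dev/sda and will overwrite data on it; run it only against a disk whose contents are expendable)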

fio -filename=/dev/sda -direct=1 -iodepth 1 -thread -rw=randrw -rwmixread=70 -ioengine=psync -bs=16k -size=15G -numjobs=20 -runtime=60 -group_reporting -name=mytest
  • Read throughput (hdparm -Tt times cached and device reads only; it does not measure writes)

hdparm -Tt --direct /dev/sda
  • Disk rotation speed

$ sudo hdparm -I /dev/sda | grep Rotation 
  Nominal Media Rotation Rate: 5400
$ sudo sg_vpd -a  /dev/sda | grep rpm 
  Nominal rotation rate: 5400 rpm
  • Partitions and capacity

# Disk capacity and partition layout (unmounted partitions are not shown)
df -Th

# Disk capacity and partition layout (unmounted partitions are shown too)
sudo fdisk -l
sudo lsblk -f

# Show disk UUIDs (unmounted partitions are shown too)
sudo blkid

# Size of the /lib directory
du -sh /lib
 
# Size of each entry directly under /lib
du -sh /lib/*
  • View disk I/O

iotop

# d: show disk read/write statistics
# m: report throughput in MB
# 2: take a sample every 2 seconds
# 3: print 3 reports in total
iostat -dm 2 3
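
The capacity figures from df can be turned into a simple alert on nearly full filesystems. This is a minimal sketch that reads the POSIX output format of df -P on standard input; the helper name fs_over_threshold and the 80% limit in the usage note are illustrative choices, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# fs_over_threshold PERCENT: read `df -P` output on stdin and print
# "<mount point> <usage>" for every filesystem at or above PERCENT usage.
fs_over_threshold() {
    awk -v limit="$1" '
        NR > 1 {                  # skip the header line
            use = $5              # Capacity column, e.g. "91%"
            sub(/%$/, "", use)
            if (use + 0 >= limit + 0) print $6 " " $5
        }
    '
}
```

A typical call would be df -P | fs_over_threshold 80.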

Hardware troubleshooting

Disk health

  • Quick disk health check

$ sudo smartctl -H /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-89-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
  • Show disk identity information

$ sudo smartctl -i /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-89-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST2000DM001-1ER164
Serial Number:    W4Z40M4R
LU WWN Device Id: 5 000c50 09d433979
Firmware Version: CC26
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Nov  3 15:42:35 2021 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
  • Show device attributes (temperature, read/write counts, power-on hours, etc.)

$ sudo smartctl -A /dev/nvme1n1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-89-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF SMART DATA SECTION ===
SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        33 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    2,905,845 [1.48 TB]
Data Units Written:                 37,741,109 [19.3 TB]
Host Read Commands:                 24,704,077
Host Write Commands:                171,328,846
Controller Busy Time:               991
Power Cycles:                       32
Power On Hours:                     5,525
Unsafe Shutdowns:                   21
Media and Data Integrity Errors:    0
Error Information Log Entries:      1
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
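
The smartctl reports shown above are easy to check from a script. The following sketch assumes the output wording matches the samples in this article (the "overall-health self-assessment" line and the "Name: value" layout of the NVMe report); other device types may word their reports differently. Both function names are illustrative.

```shell
#!/bin/sh
# smart_health_passed: read `smartctl -H` output on stdin and succeed
# only if the overall-health line reports PASSED.
smart_health_passed() {
    grep -q 'SMART overall-health self-assessment test result: PASSED'
}

# smart_field NAME: read `smartctl -A` output for an NVMe device on stdin
# and print the value that follows "NAME:".
smart_field() {
    awk -F': +' -v name="$1" '$1 == name { print $2; exit }'
}
```

For example, sudo smartctl -H /dev/sda | smart_health_passed || echo "disk needs attention", or sudo smartctl -A /dev/nvme1n1 | smart_field 'Percentage Used'.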

Viewing logs

Kernel log

  • Kernel log location:

  • The file can be opened directly, but viewing it with a dedicated command is recommended

System log

  • System log location:

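journald log

  • journald log file location:

  • The files cannot be opened directly (they are binary); viewing them with a dedicated command is recommended

  • Example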

journalctl --dmesg
# Roughly equivalent to
dmesg -T
  • Colorized output

journalctl | ccze -A
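
When hunting for a hardware fault, it helps to reduce the kernel log to lines that mention device errors. The filter below is only a sketch: the pattern list is an illustrative assumption, not an exhaustive catalogue, and hw_error_lines is a hypothetical helper name.

```shell
#!/bin/sh
# hw_error_lines: read kernel log text on stdin and print the lines that match
# a few common hardware-error phrases (case-insensitive).
# The patterns are examples; extend them for the hardware being investigated.
hw_error_lines() {
    grep -Ei 'I/O error|hardware error|machine check|EDAC|medium error'
}
```

It can be fed from either source shown above, for example journalctl -k -b --no-pager | hw_error_lines or dmesg -T | hw_error_lines.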
