The top 6 common debugging methods in Linux.
( 1 ) Increase logging verbosity
On most Linux systems, error, warning, and informational messages are written to the "/var/log/messages" file. By default only messages of priority ‘info’ (information) and above are logged to this file. To increase the logging verbosity, edit the line that configures "/var/log/messages" in the "/etc/rsyslog.conf" file.
Change this line from :
*.info;mail.none;authpriv.none;cron.none /var/log/messages
to
*.debug;mail.none;authpriv.none;cron.none /var/log/messages
After making this change, save and exit the file. Restart the “rsyslogd” service ( 'systemctl restart rsyslog.service' or 'service rsyslog restart' in case of RHEL6.x and older) to make the changes come into effect. NOTE: Keep in mind that this would generate a lot of logs and "/var" may get filled up fast, so after the diagnosis make sure to revert the changes.
One could also watch the log messages as they are captured using the "tail -f /var/log/messages" command. The logging verbosity of application logs can be increased as well, provided the application supports such an option. In most cases application logs are stored separately (usually under /var, but in a differently named file or directory), as defined by the application's configuration.
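The selector change described above could be scripted roughly as follows. This is only a sketch: 'set_messages_level' is a hypothetical helper name, and it assumes the selector line has the same shape as the example above (starting with "*.<priority>;" and ending in /var/log/messages). Since it works on whichever file is passed in, it can be tried on a copy of the configuration first.

```shell
# Sketch: switch the /var/log/messages selector to another priority.
# set_messages_level is a hypothetical helper, not an rsyslog tool.
set_messages_level() {
    conf=$1    # path to rsyslog.conf, or to a copy of it
    level=$2   # minimum priority to log, e.g. debug or info
    # Only rewrite uncommented lines that start with "*.<priority>;" and
    # end in /var/log/messages. The original file is kept as <conf>.bak.
    sed -i.bak -E "/^\*\.[a-z]+;.*[[:space:]]\/var\/log\/messages[[:space:]]*\$/ s/^\*\.[a-z]+;/*.${level};/" "$conf"
}

# Example (as root), followed by a service restart:
#   set_messages_level /etc/rsyslog.conf debug
#   systemctl restart rsyslog.service
# Revert after the diagnosis:
#   set_messages_level /etc/rsyslog.conf info
```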
( 2 ) Use 'verbose' or 'debug' option if available while running any commands.
Many commands support a 'verbose' parameter (-v or --verbose), which prints debug text while the command runs. This is especially helpful when a command does not generate the expected output, fails, or reports an error. To find out whether a command supports a verbose option, refer to its man page (help page).
As most Linux administrators know, the verbose option is widely used with commands such as 'ssh'. Whenever there is a connectivity issue, we could run 'ssh' in verbose mode as shown below:
# ssh -v UserName@HostName
This verbose parameter can be repeated to increase the verbosity further (for 'ssh', up to three times: -vvv), which aids troubleshooting. Likewise, many other commands support a verbose parameter.
( 3 ) Check if port is blocked in iptables or firewalld.
If there is a problem accessing an application, a web URL, or a specific port, check whether that port is allowed through iptables/firewalld (the operating system firewall). It is common to find that the application port is up on the server (normally detected using the 'netstat -tunlp' or 'ss -tunlp' commands), yet client systems are unable to connect.
In such situations one could bring down iptables or firewalld temporarily and test the connection. If it then works, the firewall was blocking it, and the required ports can be allowed in iptables or firewalld. NOTE: Take the necessary precautions before bringing down a system firewall, as doing so may violate enterprise/corporate security policy. In that case one has to rely on 'telnet' or 'netcat' (nc) to check whether the port is blocked or allowed. For more details on using the 'telnet' or 'nc' command, refer to this blog page under the section 'What are the alternative steps that could be used to test a remote server alive status when ping check fails (if blocked by iptables)?'.
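Where stopping the firewall is not an option and neither 'telnet' nor 'nc' is installed on the client, a port probe could be sketched as below. 'port_open' is a hypothetical helper name. It assumes bash (for its /dev/tcp pseudo-device) and timeout(1) from coreutils are available on the client.

```shell
# Sketch: test whether a TCP port accepts connections, without touching
# the firewall. port_open is a hypothetical helper name.
port_open() {
    # $1 = host, $2 = port. Returns 0 if the TCP connection succeeds
    # within 3 seconds, non-zero otherwise.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example, with HostName as a placeholder for the server:
#   port_open HostName 2049 && echo "open" || echo "blocked or closed"
```

A connection that is refused fails immediately, whereas packets dropped by a firewall rule usually show up as a timeout, so the time the check takes is itself a hint.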
In RHEL6.x and below, one could perform the below steps to check if port is being blocked by iptables:
→ First, save/backup current/active iptables rules.
# iptables-save > /tmp/iptables-rules.out
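→ Flush current iptables rules. (Check the chain policies with 'iptables -L' first: if the INPUT policy is DROP, flushing the rules over a remote session will cut off that session.)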
# iptables -F
→ Stop iptables service and confirm no rules are active now
# service iptables stop
# iptables -L
→ Now, check whether the client can connect to the application or the respective port. If it can, the port needs to be allowed via iptables. To do this, the iptables service must first be started again and the rules restored.
→ Start the iptables service and check on applicable firewall rules.
# service iptables start
# iptables -L
→ If there is any discrepancy in rules then one could restore rules from backup file, save rules and restart the service as shown below:
# iptables-restore < /tmp/iptables-rules.out
# service iptables save
# service iptables restart
# iptables -nvL --line-numbers # to view list of current active rules along with line numbers
→ Add/allow respective port via iptables (Example: allow port 2049/TCP via iptables):
# iptables -I INPUT -m tcp -p tcp --dport 2049 -j ACCEPT
→ Save and restart iptables to make this permanent:
# service iptables save
# service iptables restart
In RHEL7.x and above:
→ Stop firewalld service
# systemctl stop firewalld.service
→ Check for active rules. With firewalld stopped, no active/current firewalld rules should be shown ('firewall-cmd' reports that firewalld is not running).
# firewall-cmd --list-all
OR
# iptables -L
→ Test application connectivity from the client end. If that works, it confirms that the port was being blocked. Start the firewalld service and add the required rules.
# systemctl start firewalld.service
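As a rough sketch, allowing a TCP port permanently in firewalld could look like the following. 'allow_tcp_port' is a hypothetical helper name and the APPLY variable is an assumption of this sketch. By default it only prints the 'firewall-cmd' commands so they can be reviewed first.

```shell
# Sketch: open a TCP port permanently in firewalld.
# allow_tcp_port is a hypothetical helper. It prints the commands by
# default; set APPLY=1 (as root, with firewalld running) to run them.
allow_tcp_port() {
    port=$1
    for cmd in "firewall-cmd --permanent --add-port=${port}/tcp" \
               "firewall-cmd --reload"; do
        if [ "${APPLY:-0}" = "1" ]; then
            $cmd || return 1     # run the command, stop on first failure
        else
            echo "$cmd"          # dry run: only show what would be run
        fi
    done
}

# Example: review, then apply, a rule for port 2049/TCP
#   allow_tcp_port 2049
#   APPLY=1 allow_tcp_port 2049
```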
Please visit this blog thread to find out how to add/modify/remove rules in firewalld:
( 4 ) Check if SELinux context tags are properly set.
→ Check if SELinux is enabled. Run the command "getenforce" to see whether SELinux is in 'Enforcing' or 'Permissive' mode. If SELinux is turned off, the command returns 'Disabled'.
→ If the above command returns 'Enforcing', SELinux could be the cause of the issue, and the proper SELinux context tags need to be set as required. To check whether this is the cause, the SELinux mode can be changed on the fly from 'Enforcing' to 'Permissive' by running the command:
# setenforce 0
→ Verify that the SELinux mode has changed (the below command should return 'Permissive'):
# getenforce
→ In this mode SELinux does not block any access. It only logs the denial (typically in /var/log/audit/audit.log) and allows the connection. Once this is done, test the application connections. If they work, revert the SELinux mode to 'Enforcing' ('setenforce 1') and set the proper SELinux context tags and ports.
→ Let's say that our webserver is configured with a different 'DocumentRoot' directory, pointing to '/myweb' instead of the default '/var/www/html'. In this case web access would fail, since the SELinux context tag is different. An error would also show up in the respective application log file (in this case: /var/log/httpd/error_log).
→ Verify the current SELinux context tag. The syntax to be used is : # ls -lZd /<DirectoryName>
[root@rhel7 ~]# ls -ldZ /myweb
drwxr-xr-x. root root unconfined_u:object_r:default_t:s0 /myweb
→ The above output shows that the SELinux context tag set on '/myweb' ('default_t') differs from the one set on the '/var/www/html' directory, which is 'httpd_sys_content_t', as shown below:
[root@rhel7 ~]# ls -ldZ /var/www/html
drwxr-xr-x. root root system_u:object_r:httpd_sys_content_t:s0 /var/www/html
→ Hence, this needs to be fixed using the 'semanage' command (available from the 'policycoreutils-python' package), followed by 'restorecon' to apply the new context:
# semanage fcontext -a -t httpd_sys_content_t '/myweb(/.*)?'
# restorecon -FRvv /myweb
→ Now, verify that the correct SELinux context tag is applied on the web server root directory:
[root@rhel7 ~]# ls -ldZ /myweb
drwxr-xr-x. root root system_u:object_r:httpd_sys_content_t:s0 /myweb
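To compare context tags without reading the full 'ls -Z' output, the type field could be extracted as in the sketch below. 'selinux_type' is a hypothetical helper name. It assumes the context has the usual user:role:type:level layout seen in the outputs above.

```shell
# Sketch: print only the type part of an SELinux context string.
# selinux_type is a hypothetical helper name.
selinux_type() {
    # A context looks like user:role:type:level; the type is field 3.
    printf '%s\n' "$1" | cut -d: -f3
}

# Example on a system with SELinux enabled (GNU stat prints the
# context of a file with the %C format):
#   selinux_type "$(stat -c %C /myweb)"
#   selinux_type "$(stat -c %C /var/www/html)"
```

If the two commands print different types, the directory most likely needs the 'semanage fcontext' and 'restorecon' steps shown earlier.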
( 5 ) Use 'strace' or 'ltrace' while running any commands whenever required.
One could use 'strace' to trace the system calls and signals of a command as it runs. This helps in understanding and tracing how a command gets processed, and it is handy when a command does not return the expected output and we wish to debug further.
So, to see which system calls are made when running a command, we could simply prefix the command with 'strace'. The '-o' option writes the trace to a file instead of the terminal:
# strace ls
# strace -o /tmp/ls-strace.out ls
To get a summary of how many times each system call was made:
# strace -c ls
One could check out the man page of the command to understand different arguments/options available.
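A trace saved with 'strace -o' can be long, so a couple of small filters may help when reading it. The sketch below uses hypothetical helper names ('failed_syscalls', 'syscall_counts'). It assumes the trace was taken without '-f', so each line starts with the system call name rather than a PID.

```shell
# Sketch: summarise a trace saved with "strace -o FILE command".
# failed_syscalls and syscall_counts are hypothetical helper names.
failed_syscalls() {
    # Failing calls end in "= -1 ERRNO (description)".
    grep -E '= -1 E[A-Z0-9]+ \(' "$1"
}

syscall_counts() {
    # Take the name before the first "(" on each line and count how
    # often each one occurs, most frequent first.
    sed -n -E 's/^([a-z_0-9]+)\(.*/\1/p' "$1" | sort | uniq -c | sort -rn
}

# Example:
#   strace -o /tmp/ls-strace.out ls
#   failed_syscalls /tmp/ls-strace.out
#   syscall_counts /tmp/ls-strace.out | head
```

Failed calls such as ENOENT or EACCES are often the quickest pointer to a missing file or a permission problem.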
There is also the 'ltrace' command, which is usually not installed by default. One could install the 'ltrace' package to get it. It helps in understanding the library calls that are made when a command runs. Run 'ltrace' by prefixing it to any command as shown below:
# ltrace ls
# ltrace -c ls
... this prints a summary report with the count of each library call.
Rather than dumping all the library traces on to the terminal, which is lengthy and hard to read, one could write the output to a temporary file and read it later as shown below:
# ltrace -o /tmp/ls-ltrace.out ls
So, in this case the file /tmp/ls-ltrace.out could be read or viewed later for better readability.
( 6 ) Switch from GUI to CLI if required.
Sometimes it is necessary to switch from GUI to CLI mode, which stops the graphical processes/threads and leaves the system in text mode. This also reduces the system's resource consumption, since many of the graphical processes, threads, or services running in the background are not necessary. Where possible, it is recommended to set up/install a system in non-GUI mode.
So, to switch from GUI mode to CLI, we could run the below command:
... in RHEL6 and below versions:
# init 3
... in RHEL7 and above versions:
# systemctl isolate multi-user.target
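For scripts that still think in terms of runlevels, the correspondence between the old runlevel numbers and systemd targets could be sketched as below. 'runlevel_to_target' is a hypothetical helper name, and only the three commonly used runlevels are covered.

```shell
# Sketch: translate a SysV runlevel number into its systemd target.
# runlevel_to_target is a hypothetical helper name.
runlevel_to_target() {
    case "$1" in
        1) echo rescue.target ;;       # single-user mode
        3) echo multi-user.target ;;   # text mode (CLI)
        5) echo graphical.target ;;    # GUI
        *) echo "unsupported runlevel: $1" >&2; return 1 ;;
    esac
}

# Example (as root) on RHEL7 and above:
#   systemctl isolate "$(runlevel_to_target 3)"
```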