Our Blog

Ongoing observations by End Point people

Shell Command Outputs Truncated in Python

By Selvakumar Arumugam
April 5, 2018

[Image: two guys working at computers. Photo by Sarah Pflug of Burst.]

Recently I was working on a Python script that parses and processes the output of shell commands on Ubuntu, and I noticed that the captured output was truncated.

The sections below walk through the debugging process, identify the root cause, and present a solution with a detailed explanation. All examples use Python 2.

Problem

The following shell command lists the name and version of each installed package in Ubuntu.

# dpkg -l | grep ^ii | awk '{print $2 "    " $3}'
accountsservice    0.6.35-0ubuntu7.3
acl    2.2.52-1
adduser    3.113+nmu3ubuntu3
ant    1.9.3-2build1
ant-optional    1.9.3-2build1
apache2    2.4.7-1ubuntu4.18
apache2-bin    2.4.7-1ubuntu4.18
apache2-data    2.4.7-1ubuntu4.18
apache2-utils    2.4.7-1ubuntu4.18
apparmor    2.10.95-0ubuntu2.6~14.04.1

Running the same shell command from the Python console returns truncated versions for several packages, for example accountsservice, adduser, and apache2.

>>> import subprocess
>>> installed_packages = subprocess.check_output(['dpkg -l | grep ^ii | awk \'{print $2 "    " $3}\''], shell=True)
>>> print installed_packages
accountsservice    0.6.35-0ubuntu7.
acl    2.2.52-1
adduser    3.113+nmu3ubuntu
ant    1.9.3-2build1
ant-optional    1.9.3-2build1
apache2    2.4.7-1ubuntu4.1
apache2-bin    2.4.7-1ubuntu4.1
apache2-data    2.4.7-1ubuntu4.1
apache2-utils    2.4.7-1ubuntu4.1
apparmor    2.10.95-0ubuntu2

Root Cause

To identify the root cause of the problem, I started with the source command, dpkg -l, without any filters or processing. I noticed two different results for this command, depending on whether it was piped to less. With less, the complete result was shown, with scrolling, as below.

# dpkg -l | less
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                  Version                                    Architecture Description
+++-=====================================-==========================================-============-===============================================================================
rc  aacraid                               1.2.1-52011                                amd64        This driver supports Adaptec by PMC aacraid family of cards.
ii  accountsservice                       0.6.35-0ubuntu7.3                          amd64        query and manipulate user account information
ii  acl                                   2.2.52-1                                   amd64        Access control list utilities
ii  adduser                               3.113+nmu3ubuntu3                          all          add and remove users and groups
ii  ant                                   1.9.3-2build1                              all          Java based build tool like make
ii  ant-optional                          1.9.3-2build1                              all          Java based build tool like make - optional libraries
ii  apache2                               2.4.7-1ubuntu4.18                          amd64        Apache HTTP Server
ii  apache2-bin                           2.4.7-1ubuntu4.18                          amd64        Apache HTTP Server (binary files and modules)
ii  apache2-data                          2.4.7-1ubuntu4.18                          all          Apache HTTP Server (common files)
ii  apache2-utils                         2.4.7-1ubuntu4.18                          amd64        Apache HTTP Server (utility programs for web servers)
rc  apache2.2-common                      2.2.22-1ubuntu1.11                         amd64        Apache HTTP Server common files
ii  apparmor                              2.10.95-0ubuntu2.6~14.04.1                 amd64        user-space parser utility for AppArmor

Without less, dpkg -l prints truncated data on the screen because of a column width constraint. The truncated values exactly match the Python console output. The output width is decided by the COLUMNS environment variable: commands such as dpkg restrict their column widths based on its value.

# echo $COLUMNS
127

# dpkg -l
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                   Version          Architecture     Description
+++-======================-================-================-==================================================
rc  aacraid                1.2.1-52011      amd64            This driver supports Adaptec by PMC aacraid family
ii  accountsservice        0.6.35-0ubuntu7. amd64            query and manipulate user account information
ii  acl                    2.2.52-1         amd64            Access control list utilities
ii  adduser                3.113+nmu3ubuntu all              add and remove users and groups
ii  ant                    1.9.3-2build1    all              Java based build tool like make
ii  ant-optional           1.9.3-2build1    all              Java based build tool like make - optional librari
ii  apache2                2.4.7-1ubuntu4.1 amd64            Apache HTTP Server
ii  apache2-bin            2.4.7-1ubuntu4.1 amd64            Apache HTTP Server (binary files and modules)
ii  apache2-data           2.4.7-1ubuntu4.1 all              Apache HTTP Server (common files)
ii  apache2-utils          2.4.7-1ubuntu4.1 amd64            Apache HTTP Server (utility programs for web serve
rc  apache2.2-common       2.2.22-1ubuntu1. amd64            Apache HTTP Server common files
ii  apparmor               2.10.95-0ubuntu2 amd64            user-space parser utility for AppArmor

Solution

Python's subprocess module returns the complete, untruncated output of the shell command when the argument env={} is passed to the check_output function, which gives the command an empty environment:

>>> installed_packages = subprocess.check_output(['dpkg -l | grep ^ii | awk \'{print $2 "    " $3}\''], shell=True, env={})
>>> print installed_packages
accountsservice    0.6.35-0ubuntu7.3
acl    2.2.52-1
adduser    3.113+nmu3ubuntu3
ant    1.9.3-2build1
ant-optional    1.9.3-2build1
apache2    2.4.7-1ubuntu4.18
apache2-bin    2.4.7-1ubuntu4.18
apache2-data    2.4.7-1ubuntu4.18
apache2-utils    2.4.7-1ubuntu4.18
apparmor    2.10.95-0ubuntu2.6~14.04.1
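With the complete output available, the parsing step mentioned at the start might look like the minimal sketch below. The function name parse_packages and the sample string are illustrative assumptions, not part of the original script. It also assumes the output is already text; on Python 3, check_output returns bytes, which would need decoding first.

```python
def parse_packages(output):
    """Turn 'name    version' lines into a {name: version} dict."""
    packages = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        name, version = fields
        packages[name] = version
    return packages


# Sample lines copied from the untruncated output above.
sample = (
    "accountsservice    0.6.35-0ubuntu7.3\n"
    "acl    2.2.52-1\n"
    "apparmor    2.10.95-0ubuntu2.6~14.04.1\n"
)
print(parse_packages(sample)["apparmor"])
```

A dict keyed by package name makes later version lookups straightforward, and any truncated version string would silently end up as a wrong value here, which is why the untruncated output matters.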

Explanation

Curious to know what is happening behind the scenes? To run the command, check_output ends up calling one of the C library functions execv or execve. It chooses between them based on the env argument.

Reference:

When no env argument is passed to subprocess.check_output, the os.execv function is called.

When an env argument is passed to subprocess.check_output, the os.execve function is called.

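for (i = 0; exec_array[i] != NULL; ++i) {
    const char *executable = exec_array[i];
    if (envp) {
        /* env was given: the child sees only the variables in envp */
        execve(executable, argv, envp);
    } else {
        /* no env: the child inherits the parent's environment */
        execv(executable, argv);
    }
    /* remainder of the loop body omitted */
}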
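The same choice can be imitated from Python with os.fork, os.execv, and os.execve. This is only a rough sketch of the idea, not the subprocess implementation. The helper name columns_after_exec and the PROBE command are assumptions made for illustration, and the code assumes a POSIX system with /bin/sh.

```python
import os

# Prints the value of COLUMNS, or "unset" if the variable is absent.
PROBE = 'echo "${COLUMNS-unset}"'


def columns_after_exec(env=None):
    """Fork, exec /bin/sh in the child, and return the COLUMNS value it reports."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child process
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)  # send the child's stdout into the pipe
            argv = ["sh", "-c", PROBE]
            if env is None:
                os.execv("/bin/sh", argv)        # inherits the parent's environment
            else:
                os.execve("/bin/sh", argv, env)  # sees only what env contains
        finally:
            os._exit(127)  # reached only if the exec call failed
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        value = pipe.read().strip()
    os.waitpid(pid, 0)
    return value


os.environ["COLUMNS"] = "127"
print(columns_after_exec())        # the child inherits COLUMNS
print(columns_after_exec(env={}))  # the child has no COLUMNS at all
```

The only difference between the two calls is which exec function runs, which is enough to change what the child process sees.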

What makes the execv and execve functions produce different output?

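With execv, the new process inherits the caller's entire environment, including the COLUMNS variable, so the output is truncated to fit a total width of 127 columns on our reference system. Passing COLUMNS explicitly with the same value gives the same result.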

# echo $COLUMNS
127

>>> print subprocess.check_output(['dpkg -l | grep libqtcore4'], shell=True)
ii  libqtcore4:amd64          4:4.8.5+git192-g0 amd64             Qt 4 core module

>>> print subprocess.check_output(['dpkg -l | grep libqtcore4'], shell=True, env={'COLUMNS':'127'})
ii  libqtcore4:amd64          4:4.8.5+git192-g0 amd64             Qt 4 core module

The execve function takes an additional argument, envp, which holds the environment variables for the new process (see the environ man page). The new process sees only the variables listed there. With env={}, that list is empty, so COLUMNS is not set and the output values are returned without any column width restriction. An empty COLUMNS value has the same effect.

>>> print subprocess.check_output(['dpkg -l | grep libqtcore4'], shell=True, env={})
ii  libqtcore4:amd64                      4:4.8.5+git192-g085f851+dfsg-2ubuntu4.1    amd64        Qt 4 core module

>>> print subprocess.check_output(['dpkg -l | grep libqtcore4'], shell=True, env={'COLUMNS':''})
ii  libqtcore4:amd64                      4:4.8.5+git192-g085f851+dfsg-2ubuntu4.1    amd64        Qt 4 core module
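One plausible reason why a missing COLUMNS and an empty COLUMNS behave alike is that a program reads the variable as a positive integer and ignores anything else. The sketch below illustrates that idea only. It is not dpkg's actual code, and the function name width_from_columns and its fallback parameter are assumptions.

```python
def width_from_columns(environ, fallback=None):
    """Return COLUMNS as a positive int, or fallback if it is missing or invalid."""
    raw = environ.get("COLUMNS")
    try:
        width = int(raw)
    except (TypeError, ValueError):
        # TypeError: COLUMNS is absent (None); ValueError: not a number, e.g. ''
        return fallback
    return width if width > 0 else fallback


print(width_from_columns({"COLUMNS": "127"}))  # a usable width limit
print(width_from_columns({}))                  # no limit: variable missing
print(width_from_columns({"COLUMNS": ""}))     # no limit: value is empty
```

Under this assumption, only a valid number such as 127 imposes a width limit, which is consistent with the four console results above.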

For more details refer to the man pages of execv, execve, environ.

Conclusion

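When processing shell command output in Python, it is good practice to pass the env={} argument to the subprocess.check_output function. It helps avoid unstable results down the line due to truncated values. Keep in mind that env={} clears every environment variable, not just COLUMNS, so pass along any others the command needs, such as PATH.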
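If a command depends on other environment variables, a gentler variant is to copy the current environment and drop only COLUMNS. The following is a hedged sketch of that approach, written to run on Python 2 and Python 3. The helper names env_without and check_output_without_columns are made up for this example.

```python
import os
import subprocess


def env_without(*names):
    """Return a copy of the current environment with the given variables removed."""
    env = dict(os.environ)
    for name in names:
        env.pop(name, None)
    return env


def check_output_without_columns(command):
    """Run a shell command with COLUMNS removed and return its output as text."""
    return subprocess.check_output(
        command,
        shell=True,
        env=env_without("COLUMNS"),
        universal_newlines=True,
    )


# Example call (requires dpkg, so it is left as a comment):
# installed = check_output_without_columns(
#     "dpkg -l | grep ^ii | awk '{print $2 \"    \" $3}'")
```

Because the copy is made at call time, the parent's own COLUMNS value stays untouched and only the child process runs without it.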


