Usage
#####

General
-------

.. program:: pygount

Run ``pygount`` and specify the folder to analyze recursively, for example:

.. code-block:: bash

    $ pygount ~/development/sometool

If you omit the folder, the current folder of your shell is used as the
starting point. Apart from folders you can also specify single files and shell
patterns (using ``?``, ``*`` and ranges like ``[a-z]``).

Certain files and folders are automatically excluded from the analysis:

* files starting with a dot (``.``) or ending in a tilde (``~``)
* folders starting with a dot (``.``) or named ``_svn``.

.. option:: --folders-to-skip LIST

.. option:: --names-to-skip LIST

To specify alternative patterns, use ``--folders-to-skip`` and
``--names-to-skip``. Both take a comma separated list of patterns, see below
for the pattern syntax. For example, to also prevent folders starting with two
underscores (``__``) from being analyzed, specify
``--folders-to-skip=[...],__*``.

.. option:: --suffix LIST

To limit the analysis to certain file types, you can specify a comma separated
list of suffixes to take into account, for example ``--suffix=py,sql,xml``.

.. option:: --out FILE

By default the results of the analysis are written to the standard output. To
redirect the output to a file, use for example ``--out=counts.txt``. To
explicitly redirect to the standard output specify ``--out=STDOUT``.

.. option:: --format FORMAT

By default the results of the analysis are written in a format similar to
sloccount. To change the format to an XML file similar to cloc, use
``--format=cloc-xml``.

To just get a quick grasp of the languages used in a project and their
respective importance use ``--format=summary``, which provides a language
overview and a sum total.
For example, pygount's summary looks like this::

    Language          Files    %     Code    %     Comment    %
    ----------------  -----  ------  ----  ------  -------  ------
    Python               19   51.35  1924   72.99      322   86.10
    reStructuredText      7   18.92   332   12.59        7    1.87
    markdown              3    8.11   327   12.41        1    0.27
    Batchfile             1    2.70    24    0.91        1    0.27
    YAML                  1    2.70    11    0.42        2    0.53
    Makefile              1    2.70     9    0.34        7    1.87
    INI                   1    2.70     5    0.19        0    0.00
    TOML                  1    2.70     4    0.15        0    0.00
    Text                  3    8.11     0    0.00       34    9.09
    ----------------  -----  ------  ----  ------  -------  ------
    Sum total            37          2636              374

The summary output is designed for human readers and the column widths adjust
to the data.

For further processing of the results of pygount, ``--format=json`` should be
the easiest to deal with. For more information see :doc:`json`.

Remote repositories
-------------------

In addition to local files, pygount can analyze remote git repositories:

.. code-block:: bash

    $ pygount https://github.com/roskakori/pygount.git

In the background, this creates a shallow clone of the repository in a
temporary folder that is removed automatically after the analysis. Therefore
you need to have at least read access to the repository.

If you want to analyze a specific revision, specify it at the end of the URL:

.. code-block:: bash

    $ pygount https://github.com/roskakori/pygount.git/v1.6.0

The remote URL supports the standard git protocols: git, HTTP/S and SSH.

.. code-block:: bash

    $ pygount git@github.com:username/project.git

You can specify multiple repositories, for example to include the web
application, command line client and docker container of the Weblate project:

.. code-block:: bash

    $ pygount https://github.com/WeblateOrg/weblate.git https://github.com/WeblateOrg/wlc.git https://github.com/WeblateOrg/docker.git

And you can even mix local files and remote repositories:

.. code-block:: bash

    $ pygount ~/projects/some https://github.com/roskakori/pygount.git

Patterns
--------

Some command line arguments take patterns as values.
By default, patterns are shell patterns using ``*``, ``?`` and ranges like
``[a-z]`` as placeholders. Depending on your platform, they are case sensitive
(Unix) or not (Mac OS, Windows).

If a pattern starts with ``[regex]`` you can instead specify a comma separated
list of regular expressions using all the constructs supported by the Python
regular expression syntax. Regular expressions are case sensitive unless they
include a ``(?i)`` flag.

If the first actual pattern is ``[...]``, default patterns are included.
Without it, defaults are ignored and only the patterns explicitly stated are
taken into account.

.. option:: --generated

So for example, to specify that generated code can also contain the German
word "generiert" in a case insensitive way, use
``--generated="[regex][...](?i).*generiert"``.

.. _duplicates:

Counting duplicates
-------------------

.. option:: --duplicates

By default pygount prevents multiple source files with exactly the same
content from being counted more than once. For two files to be considered
duplicates the following conditions must be met:

#. Both files have the same size.
#. Both files have the same MD5 hashcode.

This allows for an efficient detection with a very small possibility of false
positives.

However, it also means that files with only minor differences are not detected
as duplicates. Examples are files that are identical except for additional
white space, empty lines or different line endings.

If you still want to count duplicates multiple times, specify
:option:`--duplicates`. This will also result in a minor performance gain of
the analysis.

Source code encoding
--------------------

.. option:: --encoding ENCODING[;FALLBACK]

When reading source code, pygount automatically detects the encoding. It uses
a simple algorithm where it recognizes BOM, XML declarations such as:

.. code-block:: xml

    <?xml version="1.0" encoding="cp1252"?>

and "magic" comments such as:

.. code-block:: python

    # -*- coding: cp1252 -*-

If the file does not have an appropriate header, pygount attempts to read it
using UTF-8. If this fails, it reads the file using a fallback encoding (by
default CP1252) and ignores any encoding errors.

You can change this behavior using the :option:`--encoding` option:

* To keep the automatic analysis and use a different fallback encoding,
  specify for example
  :option:`--encoding=automatic;iso-8859-15 <--encoding>`.
* To use an automatic detection based on heuristics, use
  :option:`--encoding=chardet <--encoding>`. For this to work, the chardet
  package must be installed.
* To use a specific encoding (for all files analyzed), use for example
  :option:`--encoding=iso-8859-15 <--encoding>`.

Pseudo languages
----------------

If a source file is not counted, the number of lines is 0 and the language
shown is a pseudo language indicating the reason:

* ``__binary__`` - used for :ref:`binary`.
* ``__duplicate__`` - the source code is a duplicate as described at the
  command line option :option:`--duplicates`.
* ``__empty__`` - the source code is an empty file with a size of 0 bytes.
* ``__error__`` - the source code could not be parsed, for example due to an
  I/O error.
* ``__generated__`` - the source code is generated according to the command
  line option :option:`--generated`.
* ``__unknown__`` - pygments does not provide a lexer to parse the source
  code.

Other information
-----------------

.. option:: --verbose

If :option:`--verbose` is specified, pygount logs detailed information about
what it is doing.

.. option:: --help

To get a description of all the available command line options, run:

.. code-block:: bash

    $ pygount --help

.. option:: --version

To get pygount's current version number, run:

.. code-block:: bash

    $ pygount --version
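
As a closing illustration of the duplicate check described under "Counting
duplicates", the following sketch treats two files as duplicates when both
their size and their MD5 hashcode match. It is not pygount's actual
implementation: the helper names ``content_key`` and ``split_duplicates`` are
hypothetical, and reading each file fully into memory is a simplification
assumed for brevity.

.. code-block:: python

    import hashlib
    import os
    import tempfile


    def content_key(path):
        """Return (size in bytes, MD5 hex digest) of the file at path."""
        with open(path, "rb") as source_file:
            data = source_file.read()
        return len(data), hashlib.md5(data).hexdigest()


    def split_duplicates(paths):
        """Split paths into (unique, duplicates); the first file of each content wins."""
        seen_keys = set()
        unique, duplicates = [], []
        for path in paths:
            key = content_key(path)
            if key in seen_keys:
                duplicates.append(path)
            else:
                seen_keys.add(key)
                unique.append(path)
        return unique, duplicates


    if __name__ == "__main__":
        # Hypothetical sample files: b.py repeats a.py, c.py only differs in
        # its line ending.
        samples = {"a.py": b"print(1)\n", "b.py": b"print(1)\n", "c.py": b"print(1)\r\n"}
        with tempfile.TemporaryDirectory() as folder:
            paths = []
            for name in sorted(samples):
                path = os.path.join(folder, name)
                with open(path, "wb") as sample_file:
                    sample_file.write(samples[name])
                paths.append(path)
            unique, duplicates = split_duplicates(paths)
            print([os.path.basename(path) for path in unique])
            print([os.path.basename(path) for path in duplicates])

In this sketch ``b.py`` is reported as a duplicate of ``a.py``, while ``c.py``
is kept as unique because its different line ending changes both the size and
the hashcode, which matches the limitation noted for files with only minor
differences.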