API¶

Overview¶

Pygount provides a simple API for integration into other tools. The API is still a work in progress and subject to change.

Here’s an example of how to analyze one of pygount’s own source files:

>>> from pygount import SourceAnalysis
>>> SourceAnalysis.from_file("pygount/analysis.py", "pygount")
SourceAnalysis(path='pygount/analysis.py', language='Python', group='pygount', state=analyzed, code_count=509, documentation_count=141, empty_count=117, string_count=23)

Information about multiple source files can be summarized using ProjectSummary.

First, set up the summary:

>>> from pygount import ProjectSummary
>>> project_summary = ProjectSummary()

Next, find some files to analyze:

>>> from glob import glob
>>> source_paths = glob("pygount/*.py") + glob("*.md")
>>> source_paths
['pygount/command.py', 'pygount/analysis.py', 'pygount/write.py', 'pygount/__init__.py', 'pygount/xmldialect.py', 'pygount/summary.py', 'pygount/common.py', 'pygount/lexers.py', 'README.md', 'CONTRIBUTING.md', 'CHANGES.md']

Then analyze them:

>>> for source_path in source_paths:
...     source_analysis = SourceAnalysis.from_file(source_path, "pygount")
...     project_summary.add(source_analysis)

Finally, take a look at the information collected, for example by printing the values of ProjectSummary.language_to_language_summary_map:

>>> for language_summary in project_summary.language_to_language_summary_map.values():
...   print(language_summary)
...
LanguageSummary(language='Python', file_count=8, code=1232, documentation=295, empty=331, string=84)
LanguageSummary(language='markdown', file_count=3, code=64, documentation=0, empty=29, string=14)

Reference¶

Pygount counts lines of source code using pygments lexers.

class pygount.DuplicatePool¶

A pool that collects information about potential duplicate files.

duplicate_path(source_path: str) → str | None¶

Path to a duplicate for source_path or None if no duplicate exists.

Internally information is stored to identify possible future duplicates of source_path.

exception pygount.Error¶: Error to indicate that something went wrong during a pygount run.

class pygount.LanguageSummary(language: str)¶

Summary of source code counts from multiple files of the same language.

add(source_analysis: SourceAnalysis) → None¶: Add counts from source_analysis to total counts for this language.

property code_count: int¶: sum lines of code for this language

property code_percentage: float¶: percentage of lines containing code for this language across entire project

property documentation_count: int¶: sum lines of documentation for this language

property documentation_percentage: float¶: percentage of lines containing documentation for this language across entire project

property empty_count: int¶: sum empty lines for this language

property empty_percentage: float¶: percentage of empty lines for this language across entire project

property file_count: int¶: number of source code files for this language

property file_percentage: float¶: percentage of files in project

property is_pseudo_language: bool¶: True if the language is not a real programming language

property language: str¶: the language to be summarized

property line_count: int¶: sum count of all lines of any kind for this language

sort_key() → Hashable¶: sort key to sort multiple languages by importance

property source_count: int¶: sum number of source lines of code

property source_percentage: float¶: percentage of source lines for code for this language across entire project

property string_count: int¶: sum number of lines containing strings for this language

property string_percentage: float¶: percentage of lines containing strings for this language across entire project

exception pygount.OptionError(message, source=None)¶: Error to indicate that a value passed to a command line option must be fixed.

class pygount.ProjectSummary¶

Summary of source code counts for several languages and files.

add(source_analysis: SourceAnalysis) → None¶: Add counts from source_analysis to total counts.

property language_to_language_summary_map: Dict[str, LanguageSummary]¶: A map containing summarized counts for each language added with add() so far.

update_file_percentages() → None¶: Update percentages for all languages part of the project.

class pygount.SourceAnalysis(path: str, language: str, group: str, code: int, documentation: int, empty: int, string: int, state: SourceState, state_info: str | None = None)¶

Results from analyzing a source path.

Prefer the factory methods from_file() and from_state() to calling the constructor.

property code_count: int¶: number of lines containing code

property documentation_count: int¶: number of lines containing documentation (that is, comments)

property empty_count: int¶

number of empty lines, including lines containing only white space, white characters or white code words

See also: white_characters(), white_code_words()

static from_file(source_path: str, group: str, encoding: str = 'automatic', fallback_encoding: str = 'cp1252', generated_regexes=[re.compile('(?i).*automatically generated', re.IGNORECASE), re.compile('(?i).*do not edit', re.IGNORECASE), re.compile('(?i).*generated with the .+ utility', re.IGNORECASE), re.compile('(?i).*this is a generated file', re.IGNORECASE), re.compile('(?i).*generated automatically', re.IGNORECASE)], duplicate_pool: DuplicatePool | None = None, file_handle: IOBase | None = None) → SourceAnalysis¶

Factory method to create a SourceAnalysis by analyzing the source code in source_path or the open file file_handle.

Parameters:

source_path – path to source code to analyze
group – name of a logical group the source code belongs to, e.g. a package.
encoding – encoding according to encoding_for()
fallback_encoding – fallback encoding according to encoding_for()
generated_regexes – list of regular expressions that, if found within the first few lines of a source file, identify it as generated source code for which SLOC should not be counted
duplicate_pool – a DuplicatePool where information about possible duplicates is collected, or None if possible duplicates should be counted multiple times.
file_handle – a file-like object, or None to open and read the file from source_path. If the file is open in text mode, it must be opened with the correct encoding.
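
To illustrate what generated_regexes are for, here is a minimal sketch that applies the default patterns from the signature above to the first lines of a text. It is not pygount's implementation: the helper name matching_generated_line and the limit of 8 lines are assumptions made for this example, since the description only speaks of "the first few lines".

```python
import re
from typing import List, Optional, Pattern

# Default patterns as listed in the from_file() signature.
DEFAULT_GENERATED_REGEXES: List[Pattern[str]] = [
    re.compile("(?i).*automatically generated", re.IGNORECASE),
    re.compile("(?i).*do not edit", re.IGNORECASE),
    re.compile("(?i).*generated with the .+ utility", re.IGNORECASE),
    re.compile("(?i).*this is a generated file", re.IGNORECASE),
    re.compile("(?i).*generated automatically", re.IGNORECASE),
]


def matching_generated_line(
    source_text: str,
    generated_regexes: List[Pattern[str]] = DEFAULT_GENERATED_REGEXES,
    max_line_count: int = 8,  # assumption: how many lines count as "the first few"
) -> Optional[str]:
    """Return the first of the leading lines that marks the text as generated, or None."""
    for line in source_text.splitlines()[:max_line_count]:
        for regex in generated_regexes:
            # The patterns start with ".*", so match() finds the phrase anywhere in the line.
            if regex.match(line):
                return line
    return None


generated_text = "# This file was automatically generated by a tool.\nx = 1\n"
handwritten_text = "x = 1\nprint(x)\n"
print(matching_generated_line(generated_text))
print(matching_generated_line(handwritten_text))
```

A custom list passed as generated_regexes replaces the defaults, so a project with its own marker comment can be covered by adding one more compiled pattern to a copy of the default list.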

static from_state(source_path: str, group: str, state: SourceState, state_info: str | None = None) → SourceAnalysis¶: Factory method to create a SourceAnalysis with all counts set to 0 and everything else according to the specified parameters.

property group: str¶

Group the source code belongs to; this can be any text useful to group the files later on. It is perfectly valid to put all files in the same group.

(Note: this property is mostly there for compatibility with the original SLOCCount.)

property is_countable: bool¶: True if source counts can be counted towards a total.

property language: str¶: The programming language the analyzed source code is written in; if state does not equal SourceState.analyzed this will be a pseudo language.

property source_count: int¶: number of source lines of code (the sum of code_count and string_count)

property state: SourceState¶: The state of the analysis after parsing the source file.

property state_info: str | Exception | None¶

Possible additional information about state:

SourceState.duplicate: path to the original source file the path is a duplicate of
SourceState.error: the Exception causing the error
SourceState.generated: a human readable explanation why the file is considered to be generated

property string_count: int¶: number of lines containing only strings but no other code

class pygount.SourceScanner(source_patterns, suffixes='*', folders_to_skip=None, name_to_skip=None)¶

Scanner for source code files matching certain conditions.

source_paths() → Iterator[str]¶: Paths to source code files matching all the conditions for this scanner.

class pygount.SourceState(value)¶

Possible values for SourceAnalysis.state.

analyzed = 1¶: successfully analyzed

binary = 2¶: source code is a binary

duplicate = 3¶: source code is an identical copy of another

empty = 4¶: source code is empty (file size = 0)

error = 5¶: source could not be parsed

generated = 6¶: source code has been generated

unknown = 7¶: pygments does not offer any lexer to analyze the source

pygount.encoding_for(source_path: str, encoding: str = 'automatic', fallback_encoding: str | None = None, file_handle: BufferedIOBase | RawIOBase | None = None) → str¶

The encoding used by the text file stored in source_path.

The algorithm used is:

If encoding is 'automatic', attempt the following:
1. Check BOM for UTF-8, UTF-16 and UTF-32.
2. Look for XML prolog or magic heading like # -*- coding: cp1252 -*-
3. Read the file using UTF-8.
4. If all this fails, use the fallback_encoding and ignore any further encoding errors.
If encoding is 'chardet', use chardet to obtain the encoding.
For any other encoding simply use the specified value.
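
The 'automatic' steps above can be sketched with the standard library alone. This is an illustration of the described algorithm, not pygount's actual code: the function name guess_encoding, the regular expression, the limit of two header lines and the returned codec names are all assumptions.

```python
import codecs
import re

# BOMs to check. UTF-32 comes first because the UTF-32 LE BOM starts with the
# same two bytes as the UTF-16 LE BOM.
_BOM_TO_ENCODING = (
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
)

# Matches a magic comment such as "# -*- coding: cp1252 -*-" or the
# encoding attribute of an XML prolog such as encoding="iso-8859-1".
_CODING_REGEX = re.compile(rb"(?:coding[:=]\s*|encoding=[\"'])([-\w.]+)")


def guess_encoding(data: bytes, fallback_encoding: str = "cp1252") -> str:
    """Guess the encoding of raw file content following the four steps above."""
    # Step 1: check for a byte order mark.
    for bom, encoding in _BOM_TO_ENCODING:
        if data.startswith(bom):
            return encoding
    # Step 2: look for an XML prolog or magic comment in the first two lines
    # (the number of lines is an assumption of this sketch).
    for line in data.splitlines()[:2]:
        match = _CODING_REGEX.search(line)
        if match is not None:
            return match.group(1).decode("ascii")
    # Step 3: try to read the content as UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Step 4: give up and use the fallback; a caller would then decode
        # with errors="ignore" or similar to skip remaining encoding errors.
        return fallback_encoding


print(guess_encoding(b"# -*- coding: cp1252 -*-\nx = 1\n"))
print(guess_encoding("gr\u00fc\u00dfe\n".encode("utf-8")))
```

The sketch works on bytes already read into memory, whereas encoding_for() takes a source_path or an open binary file_handle.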
