API
Overview
Pygount provides a simple API for integration into other tools. However, it is still a work in progress and subject to change.
Here’s an example of how to analyze one of pygount’s own source files:
>>> from pygount import SourceAnalysis
>>> SourceAnalysis.from_file("pygount/analysis.py", "pygount")
SourceAnalysis(path='pygount/analysis.py', language='Python', group='pygount', state=analyzed, code_count=509, documentation_count=141, empty_count=117, string_count=23)
Information about multiple source files can be summarized using ProjectSummary:
First, set up the summary:
>>> from pygount import ProjectSummary
>>> project_summary = ProjectSummary()
Next, find some files to analyze:
>>> from glob import glob
>>> source_paths = glob("pygount/*.py") + glob("*.md")
>>> source_paths
['pygount/command.py', 'pygount/analysis.py', 'pygount/write.py', 'pygount/__init__.py', 'pygount/xmldialect.py', 'pygount/summary.py', 'pygount/common.py', 'pygount/lexers.py', 'README.md', 'CONTRIBUTING.md', 'CHANGES.md']
Then analyze them:
>>> for source_path in source_paths:
...     source_analysis = SourceAnalysis.from_file(source_path, "pygount")
...     project_summary.add(source_analysis)
Finally, take a look at the information collected, for example by printing the values of ProjectSummary.language_to_language_summary_map:
>>> for language_summary in project_summary.language_to_language_summary_map.values():
...     print(language_summary)
...
LanguageSummary(language='Python', file_count=8, code=1232, documentation=295, empty=331, string=84)
LanguageSummary(language='markdown', file_count=3, code=64, documentation=0, empty=29, string=14)
Reference
Pygount counts lines of source code using pygments lexers.
- class pygount.DuplicatePool¶
A pool that collects information about potential duplicate files.
- duplicate_path(source_path: str) str | None ¶
Path to a duplicate for source_path, or None if no duplicate exists.
Internally, information is stored to identify possible future duplicates of source_path.
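The contract described above (return an earlier path with the same content, otherwise remember the file and return None) can be illustrated with a minimal sketch. This is not pygount's implementation: the class name SketchDuplicatePool and the choice of SHA-256 content digests are assumptions made purely for illustration.

```python
import hashlib
from typing import Dict, Optional


class SketchDuplicatePool:
    """Hypothetical stand-in that mimics the documented contract only."""

    def __init__(self) -> None:
        # Maps a content digest to the first path seen with that content.
        self._digest_to_path: Dict[str, str] = {}

    def duplicate_path(self, source_path: str) -> Optional[str]:
        with open(source_path, "rb") as source_file:
            digest = hashlib.sha256(source_file.read()).hexdigest()
        # setdefault() stores source_path only if the digest is new and
        # returns whichever path is registered for that digest.
        first_path = self._digest_to_path.setdefault(digest, source_path)
        return None if first_path == source_path else first_path
```

The first file with a given content yields None; any later file with identical content yields the path of that first file.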
- exception pygount.Error¶
Error to indicate that something went wrong during a pygount run.
- class pygount.LanguageSummary(language: str)¶
Summary of source code counts from multiple files of the same language.
- add(source_analysis: SourceAnalysis) None ¶
Add counts from source_analysis to total counts for this language.
- property code_count: int¶
total lines of code for this language
- property code_percentage: float¶
percentage of lines containing code for this language across the entire project
- property documentation_count: int¶
total lines of documentation for this language
- property documentation_percentage: float¶
percentage of lines containing documentation for this language across the entire project
- property empty_count: int¶
total empty lines for this language
- property empty_percentage: float¶
percentage of empty lines for this language across the entire project
- property file_count: int¶
number of source code files for this language
- property file_percentage: float¶
percentage of files in project
- property is_pseudo_language: bool¶
True if the language is not a real programming language
- property language: str¶
the language to be summarized
- property line_count: int¶
total count of all lines of any kind for this language
- sort_key() Hashable ¶
sort key to sort multiple languages by importance
- property source_count: int¶
total number of source lines of code
- property source_percentage: float¶
percentage of source lines of code for this language across the entire project
- property string_count: int¶
total number of lines containing strings for this language
- property string_percentage: float¶
percentage of lines containing strings for this language across the entire project
- exception pygount.OptionError(message, source=None)¶
Error to indicate that a value passed to a command line option must be fixed.
- class pygount.ProjectSummary¶
Summary of source code counts for several languages and files.
- add(source_analysis: SourceAnalysis) None ¶
Add counts from source_analysis to total counts.
- property language_to_language_summary_map: Dict[str, LanguageSummary]¶
A map containing summarized counts for each language added with add() so far.
- update_file_percentages() None ¶
Update percentages for all languages that are part of the project.
- class pygount.SourceAnalysis(path: str, language: str, group: str, code: int, documentation: int, empty: int, string: int, state: SourceState, state_info: str | None = None)¶
Results from analyzing a source path.
Prefer the factory methods from_file() and from_state() to calling the constructor.
- property code_count: int¶
number of lines containing code
- property documentation_count: int¶
number of lines containing documentation (or comments)
- property empty_count: int¶
number of empty lines, including lines containing only white space, white characters or white code words
See also: white_characters(), white_code_words()
- static from_file(source_path: str, group: str, encoding: str = 'automatic', fallback_encoding: str = 'cp1252', generated_regexes=[re.compile('(?i).*automatically generated', re.IGNORECASE), re.compile('(?i).*do not edit', re.IGNORECASE), re.compile('(?i).*generated with the .+ utility', re.IGNORECASE), re.compile('(?i).*this is a generated file', re.IGNORECASE), re.compile('(?i).*generated automatically', re.IGNORECASE)], duplicate_pool: DuplicatePool | None = None, file_handle: IOBase | None = None) SourceAnalysis ¶
Factory method to create a SourceAnalysis by analyzing the source code in source_path or the open file file_handle.
- Parameters:
source_path – path to source code to analyze
group – name of a logical group the source code belongs to, e.g. a package
encoding – encoding according to encoding_for()
fallback_encoding – fallback encoding according to encoding_for()
generated_regexes – list of regular expressions that, if found within the first few lines of a source file, identify it as generated source code for which SLOC should not be counted
duplicate_pool – a DuplicatePool where information about possible duplicates is collected, or None if possible duplicates should be counted multiple times
file_handle – a file-like object, or None to open and read the file from source_path. If the file is open in text mode, it must be opened with the correct encoding.
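To illustrate how the generated_regexes parameter might be applied, here is a minimal sketch that checks the leading lines of a file against the default patterns from the signature above. The helper name matching_generated_line and the limit of 8 lines are assumptions, because the documentation only says "the first few lines"; pygount's actual check may differ.

```python
import re
from typing import List, Optional, Sequence

# Patterns copied from the default value of generated_regexes above.
DEFAULT_GENERATED_REGEXES: List[re.Pattern] = [
    re.compile(r"(?i).*automatically generated", re.IGNORECASE),
    re.compile(r"(?i).*do not edit", re.IGNORECASE),
    re.compile(r"(?i).*generated with the .+ utility", re.IGNORECASE),
    re.compile(r"(?i).*this is a generated file", re.IGNORECASE),
    re.compile(r"(?i).*generated automatically", re.IGNORECASE),
]

# Assumption: how many leading lines count as "the first few lines".
ASSUMED_LINES_TO_CHECK = 8


def matching_generated_line(
    source_lines: Sequence[str],
    generated_regexes: Sequence[re.Pattern] = DEFAULT_GENERATED_REGEXES,
) -> Optional[str]:
    """Return the first leading line that marks the source as generated."""
    for line in source_lines[:ASSUMED_LINES_TO_CHECK]:
        if any(regex.match(line) for regex in generated_regexes):
            return line
    return None
```

A file whose header contains, say, "# This file was automatically generated" would be reported by this sketch, while the same marker further down than the assumed limit would be ignored.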
- static from_state(source_path: str, group: str, state: SourceState, state_info: str | None = None) SourceAnalysis ¶
Factory method to create a SourceAnalysis with all counts set to 0 and everything else according to the specified parameters.
- property group: str¶
Group the source code belongs to; this can be any text useful to group the files later on. It is perfectly valid to put all files in the same group.
(Note: this property is mostly there for compatibility with the original SLOCCount.)
- property is_countable: bool¶
True if source counts can be counted towards a total.
- property language: str¶
The programming language the analyzed source code is written in; if state does not equal SourceState.analyzed, this will be a pseudo language.
- property source_count: int¶
number of source lines of code (the sum of code_count and string_count)
- property state: SourceState¶
The state of the analysis after parsing the source file.
- property state_info: str | Exception | None¶
Possible additional information about state:
SourceState.duplicate: path to the original source file the path is a duplicate of
SourceState.error: the Exception causing the error
SourceState.generated: a human readable explanation why the file is considered to be generated
- property string_count: int¶
number of lines containing only strings but no other code
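The relationship between the individual counts can be shown with a small sketch. SketchCounts is a hypothetical container, not a pygount class. The source_count formula is the one documented above; the line_count formula is an assumption that every line is counted as exactly one of the four kinds. The numbers are taken from the first example in the Overview.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchCounts:
    """Hypothetical container; field names mirror the documented properties."""

    code_count: int
    documentation_count: int
    empty_count: int
    string_count: int

    @property
    def source_count(self) -> int:
        # Documented above: the sum of code_count and string_count.
        return self.code_count + self.string_count

    @property
    def line_count(self) -> int:
        # Assumption: each line falls into exactly one of the four kinds.
        return (
            self.code_count
            + self.documentation_count
            + self.empty_count
            + self.string_count
        )


# Counts reported for pygount/analysis.py in the Overview example.
counts = SketchCounts(code_count=509, documentation_count=141, empty_count=117, string_count=23)
print(counts.source_count)  # 532
print(counts.line_count)  # 790
```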
- class pygount.SourceScanner(source_patterns, suffixes='*', folders_to_skip=None, name_to_skip=None)¶
Scanner for source code files matching certain conditions.
- source_paths() Iterator[str] ¶
Paths to source code files matching all the conditions for this scanner.
- class pygount.SourceState(value)¶
Possible values for SourceAnalysis.state.
- analyzed = 1¶
successfully analyzed
- binary = 2¶
source code is a binary
- duplicate = 3¶
source code is an identical copy of another
- empty = 4¶
source code is empty (file size = 0)
- error = 5¶
source could not be parsed
- generated = 6¶
source code has been generated
- unknown = 7¶
pygments does not offer any lexer to analyze the source
- pygount.encoding_for(source_path: str, encoding: str = 'automatic', fallback_encoding: str | None = None, file_handle: BufferedIOBase | RawIOBase | None = None) str ¶
The encoding used by the text file stored in source_path.
The algorithm used is:
If encoding is 'automatic', attempt the following:
  1. Check BOM for UTF-8, UTF-16 and UTF-32.
  2. Look for an XML prolog or a magic heading like # -*- coding: cp1252 -*-.
  3. Read the file using UTF-8.
  4. If all this fails, use the fallback_encoding and ignore any further encoding errors.
If encoding is 'chardet', use chardet to obtain the encoding.
For any other encoding, simply use the specified value.
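The 'automatic' steps can be sketched in plain Python as follows. This is a simplified illustration working on raw bytes, not pygount's implementation: the function name sketch_encoding_for, the regular expressions, the two-line search window and the returned codec names are all assumptions, the default fallback of 'cp1252' is borrowed from from_file(), and the 'chardet' branch is left out.

```python
import codecs
import re

# UTF-32 is tested before UTF-16 because the UTF-32 little endian BOM
# starts with the same two bytes as the UTF-16 little endian BOM.
_BOM_TO_ENCODING = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)

# Matches magic headings such as "# -*- coding: cp1252 -*-".
_MAGIC_REGEX = re.compile(rb"coding[:=]\s*([-\w.]+)")
# Matches the encoding attribute of an XML prolog.
_XML_PROLOG_REGEX = re.compile(rb"<\?xml[^>]*encoding\s*=\s*[\"']([-\w.]+)[\"']")


def sketch_encoding_for(
    data: bytes, encoding: str = "automatic", fallback_encoding: str = "cp1252"
) -> str:
    if encoding != "automatic":
        # Any other value is used as specified ('chardet' is not covered here).
        return encoding
    # Step 1: check for a byte order mark.
    for bom, bom_encoding in _BOM_TO_ENCODING:
        if data.startswith(bom):
            return bom_encoding
    # Step 2: look for an XML prolog or a magic heading.
    # Assumption: only the first two lines are searched.
    head = b"\n".join(data.splitlines()[:2])
    match = _XML_PROLOG_REGEX.search(head) or _MAGIC_REGEX.search(head)
    if match is not None:
        return match.group(1).decode("ascii")
    # Step 3: try to read the data as UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Step 4: everything failed, so use the fallback.
        return fallback_encoding
```

For example, sketch_encoding_for(b"# -*- coding: cp1252 -*-\nx = 1\n") returns 'cp1252', while bytes that are not valid UTF-8 and carry no BOM or heading yield the fallback encoding.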