Command-line interface#
swh indexer#
Software Heritage Indexer tools.
The Indexer is used to mine the content of the archive and extract derived information from archive source code artifacts.
swh indexer [OPTIONS] COMMAND [ARGS]...
Options
- -C, --config-file <config_file>#
- Configuration file. 
journal-client#
Listens for new objects from the SWH Journal and runs the indexer whose name is passed as the argument.
Passing ‘*’ as indexer name runs all indexers.
swh indexer journal-client [OPTIONS] {origin_intrinsic_metadata|extrinsic_metadata|content_mimetype|content_fossology_license|*}
Options
- --broker <brokers>#
- Kafka broker to connect to. 
- --prefix <prefix>#
- Prefix of Kafka topic names to read from. 
- --group-id <group_id>#
- Consumer/group id for reading from Kafka. 
- -m, --stop-after-objects <stop_after_objects>#
- Maximum number of objects to replay. Default is to run forever. 
- -b, --batch-size <batch_size>#
- Batch size. Default is 200. 
Arguments
- INDEXER#
- Required argument 
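As a minimal sketch of how the options above fit together, the following Python snippet assembles a `journal-client` command line without running it. The broker address, topic prefix, and group id are hypothetical placeholders, and `journal_client_argv` is a helper invented for this illustration, not part of swh. The sketch passes `--broker`, `--prefix`, and `--group-id` explicitly; whether the CLI can also take them from the configuration file is not covered here.

```python
import shlex

# Indexer names accepted by `journal-client`, copied from its usage line.
INDEXERS = (
    "origin_intrinsic_metadata",
    "extrinsic_metadata",
    "content_mimetype",
    "content_fossology_license",
    "*",  # runs all indexers
)


def journal_client_argv(indexer, broker, prefix, group_id,
                        config_file=None, stop_after_objects=None,
                        batch_size=None):
    """Build the argument list for `swh indexer journal-client` (illustrative helper)."""
    if indexer not in INDEXERS:
        raise ValueError(f"unknown indexer name: {indexer!r}")
    argv = ["swh", "indexer"]
    if config_file is not None:
        # -C/--config-file belongs to the `swh indexer` group, so it
        # goes before the subcommand name.
        argv += ["--config-file", config_file]
    argv += ["journal-client", "--broker", broker,
             "--prefix", prefix, "--group-id", group_id]
    if stop_after_objects is not None:
        argv += ["--stop-after-objects", str(stop_after_objects)]
    if batch_size is not None:
        argv += ["--batch-size", str(batch_size)]
    argv.append(indexer)
    return argv


# Hypothetical values, for illustration only.
argv = journal_client_argv(
    "*",
    broker="kafka.example.org:9092",
    prefix="swh.journal.objects",
    group_id="swh.indexer.example",
    stop_after_objects=1000,
)
# shlex.join quotes '*' so that a shell does not expand it as a glob.
print(shlex.join(argv))
```

When typing the command by hand, the same quoting applies: an unquoted `*` would be expanded by most shells into the file names of the current directory before `swh` ever sees it.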
mapping#
Manage Software Heritage Indexer mappings.
swh indexer mapping [OPTIONS] COMMAND [ARGS]...
list#
Prints the list of known mappings.
swh indexer mapping list [OPTIONS]
list-terms#
Prints the list of known CodeMeta terms, and which mappings support them.
swh indexer mapping list-terms [OPTIONS]
Options
- --exclude-mapping <exclude_mapping>#
- Exclude the given mapping from the output 
- --concise#
- Don’t print the list of mappings supporting each term. 
translate#
Translates FILE from the format handled by the MAPPING_NAME mapping to the CodeMeta format.
swh indexer mapping translate [OPTIONS] MAPPING_NAME FILE
Arguments
- MAPPING_NAME#
- Required argument 
- FILE#
- Required argument 
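The two positional arguments can be illustrated with a short Python sketch that calls the command through `subprocess`. The mapping name `npm` and the file `package.json` are assumptions chosen for illustration (`swh indexer mapping list` prints the names actually available), and `translate_argv` is a hypothetical helper. The snippet only prints whatever the command writes; it makes no assumption about the output format.

```python
import shutil
import subprocess


def translate_argv(mapping_name, path):
    """Build the argv for `swh indexer mapping translate MAPPING_NAME FILE`."""
    return ["swh", "indexer", "mapping", "translate", mapping_name, str(path)]


if __name__ == "__main__":
    # Hypothetical mapping name and input file.
    argv = translate_argv("npm", "package.json")
    if shutil.which("swh") is None:
        # The swh command is not installed here, so only show the command.
        print("swh not found on PATH; would run:", argv)
    else:
        result = subprocess.run(argv, capture_output=True, text=True, check=False)
        print(result.stdout or result.stderr)
```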
rpc-serve#
Starts a Software Heritage Indexer RPC HTTP server.
swh indexer rpc-serve [OPTIONS] CONFIG_PATH
Options
- --host <host>#
- Host on which to run the server
- --port <port>#
- Port to bind the server to
- --debug, --nodebug#
- Whether the server should run in debug mode
Arguments
- CONFIG_PATH#
- Required argument