Elastic CSV Loader¶
This command-line utility loads a CSV file into an Elasticsearch index, using a provided YAML config file.
load-csv considerations:
CSV files MUST include a header with field names
Header field names will be used as elastic index fields
@timestamp and date fields will be added to all indexed docs
A logic date for the date field can be forced through a command parameter (--logic-date).
Depending on the elastic_index.data_format.parent_data_object value, all original CSV header fields will be arranged under a data parent object.
Indexed data will use the same field names that
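The load-csv considerations above can be illustrated with a minimal sketch of how one CSV row might become one indexed document. This is not elasticcsv's own code: the function name row_to_doc, the sample data, and the exact values written to @timestamp and date are assumptions made for illustration only.

```python
import csv
import io
from datetime import datetime, timezone


def row_to_doc(row, logic_date, parent_data_object=True):
    """Build one document from a CSV row (hypothetical helper, not elasticcsv's code)."""
    fields = dict(row)
    # With parent_data_object enabled, the original header fields sit under "data".
    doc = {"data": fields} if parent_data_object else dict(fields)
    # Assumption: @timestamp holds the load time and date holds the logic date.
    doc["@timestamp"] = datetime.now(timezone.utc).isoformat()
    doc["date"] = logic_date
    return doc


# The header line supplies the field names, as the tool requires.
sample = io.StringIO("name;amount\nalice;10\nbob;20\n")
reader = csv.DictReader(sample, delimiter=";")
docs = [row_to_doc(row, "2024-01-31") for row in reader]
print(docs[0]["data"])
```

With parent_data_object=False the same header fields would sit at the top level of the document, next to @timestamp and date.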
download-index considerations:
If the CSV file already exists, the download process appends data to it, including headers.
Rename or delete the previous CSV file (or pass -d / --delete-if-exists) if you want to start fresh.
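If a CSV file has already been appended to several times, it will contain repeated header rows. The sketch below shows one possible way to clean such a file after the fact. The helper drop_repeated_headers is hypothetical and not part of csv2es, and it assumes that no real data row is identical to the header.

```python
import csv
import io


def drop_repeated_headers(lines, sep=";"):
    """Return the CSV rows, keeping only the first header row (hypothetical helper)."""
    reader = csv.reader(lines, delimiter=sep)
    header = next(reader, None)
    if header is None:
        return []
    rows = [header]
    # Any later row equal to the header is treated as a leftover from an append.
    rows.extend(row for row in reader if row != header)
    return rows


appended = io.StringIO("name;amount\nalice;10\nname;amount\nbob;20\n")
cleaned = drop_repeated_headers(appended)
print(cleaned)
```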
Install¶
Dependencies¶
Python 3.10 or higher
Package manager
pip install --upgrade elasticcsv
Run¶
Elastic Connection Config¶
Connection configuration is based on a YAML text file (connection.yaml) that must be present in at
least one of the following locations:
The same directory where the csv2es command is run (takes precedence if it exists)
<home>/.config/elasticcsv/connection.yaml (default; created with a sample config if it does not exist)
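The lookup order described above can be sketched as follows. This is only an illustration of the documented precedence, not the tool's actual implementation. The function name find_connection_file is hypothetical, and the sketch does not create the sample config when nothing is found.

```python
from pathlib import Path
from typing import Optional

CONFIG_NAME = "connection.yaml"


def find_connection_file(cwd: Path, home: Path) -> Optional[Path]:
    """Return the first existing config file, checking the run directory first."""
    candidates = [
        cwd / CONFIG_NAME,  # takes precedence if it exists
        home / ".config" / "elasticcsv" / CONFIG_NAME,  # default location
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


# Prints the resolved path, or None if neither location has a config file.
print(find_connection_file(Path.cwd(), Path.home()))
```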
Sample connection.yaml
elastic_connection:
  proxies:
    http: "http://user:pass@proxy.url:8080"
    https: "http://user:pass@proxy.url:8080"
  user: myuser
  password: mypassword
  apikey_id: myapikey # apikeys auth takes precedence over user/password
  apikey_secret: myapikeysecret
  node: my.elastic.node
  schema: https
  port: 443
elastic_index:
  data_format:
    parent_data_object: true
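As a rough illustration of how the sample settings fit together, the sketch below takes the configuration as an already parsed dictionary and picks the credentials, applying the documented rule that API key auth takes precedence over user/password. The function connection_settings is hypothetical, and the schema://node:port URL layout is an assumption, not something the tool documents.

```python
def connection_settings(config):
    """Derive a node URL and credentials from a parsed connection.yaml dict."""
    conn = config["elastic_connection"]
    # Assumption: the node URL is assembled as schema://node:port.
    url = f"{conn['schema']}://{conn['node']}:{conn['port']}"
    # API key auth takes precedence over user/password when both are present.
    if conn.get("apikey_id") and conn.get("apikey_secret"):
        credentials = ("apikey", conn["apikey_id"], conn["apikey_secret"])
    else:
        credentials = ("basic", conn["user"], conn["password"])
    return url, credentials


sample_config = {
    "elastic_connection": {
        "user": "myuser",
        "password": "mypassword",
        "apikey_id": "myapikey",
        "apikey_secret": "myapikeysecret",
        "node": "my.elastic.node",
        "schema": "https",
        "port": 443,
    },
    "elastic_index": {"data_format": {"parent_data_object": True}},
}
url, credentials = connection_settings(sample_config)
print(url, credentials[0])
```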
Run command¶
❯ csv2es load-csv --help
Usage: csv2es load-csv [OPTIONS]
Loads csv to elastic index
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ * --csv PATH CSV File [default: None] [required] │
│ * --index TEXT Elastic Index [default: None] [required] │
│ --logic-date [%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d %H:%M:%S] Date reference for interfaces [default: None] │
│ --csv-date-format TEXT date format for *_date columns as for ex: '%Y-%m-%d' [default: %Y-%m-%d] │
│ --sep TEXT CSV field sepator [default: ;] │
│ --csv-offset INTEGER CSV file offset [default: 0] │
│ --delete-if-exists -d Flag for deleting index before running load │
│ --dict-columns TEXT Comma separated list of colums of type dict to load as dicts [default: None] │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Python date format reference: String Format Time (strftime format codes)
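The three formats listed for --logic-date are standard Python strftime/strptime patterns. The sketch below shows how a value could be checked against them with the standard library. The helper parse_logic_date is hypothetical and only demonstrates the format strings, not how csv2es validates its arguments.

```python
from datetime import datetime

# The formats listed in the --logic-date help text.
LOGIC_DATE_FORMATS = ("%Y-%m-%d", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%d %H:%M:%S")


def parse_logic_date(value):
    """Parse value with the first matching format, or raise ValueError."""
    for fmt in LOGIC_DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"unsupported logic date: {value!r}")


print(parse_logic_date("2024-01-31"))
print(parse_logic_date("2024-01-31T10:20:30"))
```

The same codes apply to --csv-date-format, whose default %Y-%m-%d matches values such as 2024-01-31.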
❯ csv2es download-index --help
Usage: csv2es download-index [OPTIONS]
Download index to csv file
Options:
--csv PATH CSV File [required]
--sep TEXT CSV field sepator [required]
--index TEXT Elastic Index [required]
-d, --delete-if-exists Flag for deleting csv file before download
--help Show this message and exit.
Example:
csv2es load-csv --csv ./pathtomyfile/file.csv --index myindex --sep ";"
csv2es download-index --csv ./pathtomyfile/file.csv --index myindex --sep ";" -d