
Elastic CSV Loader

This command-line utility loads a CSV file into an Elasticsearch index, using a provided YAML config file.

load-csv considerations:

  • CSV files MUST include a header with field names

  • Header field names will be used as elastic index fields

  • A @timestamp field and a date field will be added to all indexed docs

    • A logic date can be forced through a command parameter (--logic-date).

  • Depending on the elastic_index.data_format.parent_data_object value, all original CSV header fields will be arranged under a data parent object.
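
The considerations above can be illustrated with a minimal sketch of how one CSV row might be shaped before indexing. This is not the tool's actual code: the function name build_doc, the key name "data" for the parent object, and the exact value formats of @timestamp and date are assumptions made for illustration.

```python
from datetime import datetime, timezone

def build_doc(row, logic_date=None, parent_data_object=True):
    """Arrange one CSV row as an indexable document (illustrative only)."""
    now = datetime.now(timezone.utc)
    # Assumption: a forced logic date, when given, replaces the load date.
    date_value = logic_date or now.strftime("%Y-%m-%d")
    doc = {"@timestamp": now.isoformat(), "date": date_value}
    if parent_data_object:
        # Original header fields are nested under a parent object.
        doc["data"] = dict(row)
    else:
        # Original header fields stay at the top level of the document.
        doc.update(row)
    return doc

row = {"name": "alice", "amount": "10"}
nested = build_doc(row, logic_date="2024-01-31")
flat = build_doc(row, logic_date="2024-01-31", parent_data_object=False)
```

With parent_data_object enabled, the header fields sit under nested["data"]; with it disabled, they share the top level with @timestamp and date.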

Indexed data will use the same field names that appear in the CSV header.

download-index considerations:

  • If the csv file already exists, the download process will append data to it, including headers

  • You have to rename or delete previous csv file if you want to start fresh.
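
To see why starting fresh matters, here is a small sketch that mimics the append behaviour described above using only the standard library. The helper append_rows and the sample column names are hypothetical; the sketch does not call csv2es or Elasticsearch.

```python
import csv
import tempfile
from pathlib import Path

def append_rows(path, header, rows, sep=";"):
    """Append a header line plus rows, mimicking an appending download."""
    with open(path, "a", newline="") as fh:
        writer = csv.writer(fh, delimiter=sep)
        writer.writerow(header)
        writer.writerows(rows)

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "file.csv"
    append_rows(target, ["id", "name"], [["1", "a"]])
    # A second download into the same file repeats the header line.
    append_rows(target, ["id", "name"], [["2", "b"]])
    lines = target.read_text().splitlines()
    header_count = lines.count("id;name")

    # Deleting the previous file first gives a clean result.
    target.unlink()
    append_rows(target, ["id", "name"], [["2", "b"]])
    fresh_lines = target.read_text().splitlines()
```

After the two appends the header line appears twice in the file; after deleting the file and writing again it appears once.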

Install

Dependencies

  • Python 3.10 or higher

  • Package manager

pip install --upgrade elasticcsv

Run

Elastic Connection Config

Connection configuration is based on a YAML text file (connection.yaml) that must be present in at least one of the following locations:

  • The directory where the csv2es command is run (takes precedence if it exists)

  • <home>/.config/elasticcsv/connection.yaml (default; created with a sample config if it does not exist)
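
The lookup order above can be sketched as follows. The function find_connection_file is a hypothetical helper written for illustration, not part of elasticcsv, and it only resolves the path; it does not create the sample config.

```python
import tempfile
from pathlib import Path

CONFIG_NAME = "connection.yaml"

def find_connection_file(cwd, home):
    """Return the connection file path, preferring the working directory."""
    local = Path(cwd) / CONFIG_NAME
    if local.is_file():
        return local  # a file next to the command takes precedence
    # Otherwise fall back to the per-user default location.
    return Path(home) / ".config" / "elasticcsv" / CONFIG_NAME

with tempfile.TemporaryDirectory() as cwd, tempfile.TemporaryDirectory() as home:
    # No local file yet: the default location is selected.
    default_choice = find_connection_file(cwd, home)
    default_is_home = default_choice == Path(home) / ".config" / "elasticcsv" / CONFIG_NAME
    # Once a local file exists, it wins over the default.
    (Path(cwd) / CONFIG_NAME).write_text("elastic_connection: {}\n")
    local_choice = find_connection_file(cwd, home)
    local_wins = local_choice.parent == Path(cwd)
```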

Sample connection.yaml

elastic_connection:
  proxies:
    http: "http://user:pass@proxy.url:8080"
    https: "http://user:pass@proxy.url:8080"
  user: myuser
  password: mypassword
  apikey_id: myapikey  # apikeys auth takes precedence over user/password
  apikey_secret: myapikeysecret
  node: my.elastic.node
  schema: https
  port: 443
elastic_index:
  data_format:
    parent_data_object: true
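
As a rough illustration of how these settings fit together, the sketch below mirrors the sample as a plain Python dict, builds the node URL from schema, node and port, and applies the stated rule that API key auth takes precedence over user/password. The helpers node_url and select_auth are hypothetical and are not the tool's own functions.

```python
# The dict mirrors the sample connection.yaml shown above.
sample = {
    "user": "myuser",
    "password": "mypassword",
    "apikey_id": "myapikey",
    "apikey_secret": "myapikeysecret",
    "node": "my.elastic.node",
    "schema": "https",
    "port": 443,
}

def node_url(conn):
    """Compose the node address from schema, node and port."""
    return f"{conn['schema']}://{conn['node']}:{conn['port']}"

def select_auth(conn):
    """Pick the credentials to use; API keys win when both are configured."""
    if conn.get("apikey_id") and conn.get("apikey_secret"):
        return ("api_key", (conn["apikey_id"], conn["apikey_secret"]))
    return ("basic", (conn["user"], conn["password"]))

url = node_url(sample)
auth_with_key = select_auth(sample)

# Without API key entries, user/password is used instead.
no_key = {k: v for k, v in sample.items() if not k.startswith("apikey_")}
auth_without_key = select_auth(no_key)
```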

Run command

❯ csv2es load-csv --help
Usage: csv2es load-csv [OPTIONS]

  Loads csv to elastic index


╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --csv                       PATH                                            CSV File [default: None] [required]                                                                                                               │
│ *  --index                     TEXT                                            Elastic Index [default: None] [required]                                                                                                          │
│    --logic-date                [%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d %H:%M:%S]  Date reference for interfaces [default: None]                                                                                                     │
│    --csv-date-format           TEXT                                            date format for *_date columns as for ex: '%Y-%m-%d' [default: %Y-%m-%d]                                                                          │
│    --sep                       TEXT                                            CSV field sepator [default: ;]                                                                                                                    │
│    --csv-offset                INTEGER                                         CSV file offset [default: 0]                                                                                                                      │
│    --delete-if-exists  -d                                                      Flag for deleting index before running load                                                                                                       │
│    --dict-columns              TEXT                                            Comma separated list of colums of type dict to load as dicts [default: None]                                                                      │
│    --help                                                                      Show this message and exit.                                                                                                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Python date format reference: String Format Time (strftime)
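
The three formats accepted by --logic-date are standard Python strptime patterns. The sketch below shows how such a value could be parsed with the standard library; parse_logic_date is a hypothetical helper, and the way the tool itself parses the value is not confirmed here.

```python
from datetime import datetime

# The three patterns listed for --logic-date in the help output above.
LOGIC_DATE_FORMATS = ["%Y-%m-%d", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%d %H:%M:%S"]

def parse_logic_date(text):
    """Try each accepted pattern in turn and return the first match."""
    for fmt in LOGIC_DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unsupported logic date: {text!r}")

day_only = parse_logic_date("2024-01-31")
iso_style = parse_logic_date("2024-01-31T08:30:00")
spaced = parse_logic_date("2024-01-31 08:30:00")
```

The same pattern syntax applies to --csv-date-format, whose default is '%Y-%m-%d'.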

❯ csv2es download-index --help
Usage: csv2es download-index [OPTIONS]

  Download index to csv file

Options:
  --csv PATH              CSV File  [required]
  --sep TEXT              CSV field sepator  [required]
  --index TEXT            Elastic Index  [required]
  -d, --delete-if-exists  Flag for deleting csv file before download
  --help                  Show this message and exit.

Example:

csv2es load-csv --csv ./pathtomyfile/file.csv --index myindex --sep ";"

csv2es download-index --csv ./pathtomyfile/file.csv --index myindex --sep ";" -d