Batchables

Packets of code to be batched by core.routines routines. Each packet should sit in its own directory, with a file called run.py containing a ‘main’ function called run(), which will be executed on the AWS Batch system.

Each run.py should expect an environment variable called BATCHPAR_outfile, which provides information on the output location. Other input parameters should also be prefixed with BATCHPAR_, as set in the corresponding core.routines routine.
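
As an illustration of the convention described above, the following is a minimal sketch of what a run.py could look like, assuming the BATCHPAR_ parameters arrive as ordinary process environment variables. The helper get_batch_params, the optional environ argument, and the parameter name batch_size are hypothetical and are not taken from the nesta codebase; only run(), BATCHPAR_outfile and the BATCHPAR_ prefix come from the text.

```python
import os

PREFIX = "BATCHPAR_"  # prefix named in the text above


def get_batch_params(environ=None):
    """Return BATCHPAR_-prefixed variables as a dict, with the prefix stripped.

    `environ` defaults to os.environ; passing a plain dict makes testing easy.
    This helper is hypothetical and not part of the nesta codebase.
    """
    environ = os.environ if environ is None else environ
    return {
        key[len(PREFIX):]: value
        for key, value in environ.items()
        if key.startswith(PREFIX)
    }


def run(environ=None):
    """Entry point expected in each batchable's run.py."""
    params = get_batch_params(environ)
    # BATCHPAR_outfile is required; fail early with a clear message if absent.
    if "outfile" not in params:
        raise KeyError("BATCHPAR_outfile is not set")
    outfile = params.pop("outfile")
    # Placeholder for the real work: a batchable would process its inputs
    # here and write the results to the location described by `outfile`.
    print(f"Would write results to {outfile} using inputs {params}")
    return outfile, params
```

Called as run({"BATCHPAR_outfile": "results.json", "BATCHPAR_batch_size": "100"}), the sketch separates the output location from the remaining inputs. Note that values read from the environment are always strings, so a real batchable would need to convert numeric parameters itself.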

Data / project specific batchables

  • Example
    • run.py (batch_example)
    • run.py (template_batchable)
  • arXiv data (technical research)
    • run.py (arxiv_elasticsearch)
  • CORDIS (EU-funded research)
    • run.py (cordis_api)
  • Crunchbase data (private companies)
    • run.py (crunchbase_collect)
    • run.py (crunchbase_elasticsearch)
  • EURITO
    • run.py (arxiv_eu)
    • run.py (crunchbase_eu)
    • run.py (patstat_eu)
  • GtR (UK publicly funded research)
    • run.py (collect_gtr)
    • run.py (embed_topics)
  • NiH data (health research)
    • run.py (nih_collect_data)
    • run.py (nih_process_data)
    • run.py (nih_abstract_mesh_data)
    • run.py (nih_dedupe)
  • Meetup (social networking / knowledge exchange)
    • run.py (country_groups)
    • run.py (groups_members)
    • run.py (members_groups)
    • run.py (group_details)
    • run.py (topic_tag_elasticsearch)

General-purpose batchables

  • Bulk geocoding
    • run.py (batch_geocode)
  • Natural Language Processing
    • [AutoML*] run.py (corex_topic_model)
    • [AutoML] run.py (ngrammer)
    • [AutoML] run.py (tfidf)
    • [AutoML] vectorizer (run.py)
  • Novelty
    • run.py (lolvelty)

© Copyright 2018, nesta Revision 8bb8d8b5.