6. Case Study: VaSyR 2018

In this section, we present an end-to-end example using a sample dataset.

6.1. Query generation

Each of the six queries returned X potentially relevant papers.

6.2. Semantic Search

Each of the six queries returned X potentially relevant papers.

6.3. Topic Modeling

Of the XX papers returned across the six query types, YYY had abstracts available in either the NLP4Dev or Semantic Scholar corpora, and ZZZ had full body text available.

These were used as inputs to the NLP4Dev API, as described in Section 3.5.

XXX papers were classified as relevant.
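
The coverage counts at the start of this subsection depend on how papers returned by more than one query are handled. The sketch below shows one way to compute them, assuming duplicates are collapsed so that each paper is counted once. The function name, the paper identifiers, and the availability sets are all hypothetical and are not taken from the project's code or results.

```python
def coverage_counts(results_by_query, has_abstract, has_full_text):
    """Count unique papers and how many have an abstract or full text.

    results_by_query: dict mapping a query name to a list of paper IDs.
    has_abstract, has_full_text: sets of paper IDs with that content available.
    Assumption: a paper returned by several queries is counted once.
    """
    unique_papers = set().union(*results_by_query.values())
    return {
        "returned": len(unique_papers),
        "with_abstract": len(unique_papers & has_abstract),
        "with_full_text": len(unique_papers & has_full_text),
    }


# Illustrative data only (hypothetical IDs, not results from this case study).
example_results = {
    "query_1": ["p1", "p2", "p3"],
    "query_2": ["p2", "p4"],
}
example_counts = coverage_counts(
    example_results,
    has_abstract={"p1", "p2", "p4"},
    has_full_text={"p2"},
)
print(example_counts)
```

If the reported totals are instead meant to count every query hit, the union step would be replaced by a simple sum of list lengths.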

6.4. Model Output

Following manual review of the papers identified automatically, XXX were confirmed as relevant. Of these XXX, YYY had previously been identified through the manual procedure, and ZZZ were new.

This represents a XXX percent improvement over the baseline.
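
An improvement percentage can be defined in more than one way, so it helps to make the formula explicit. The sketch below assumes the baseline is the number of relevant papers found by the manual procedure and that the improvement is the number of newly found relevant papers expressed as a percentage of that baseline. The function name and the example counts are hypothetical.

```python
def percent_improvement(baseline_count: int, new_count: int) -> float:
    """Newly identified relevant papers as a percentage of the manual baseline.

    Assumption: baseline_count is the number of relevant papers found by the
    manual procedure, and new_count is the number found only automatically.
    """
    if baseline_count <= 0:
        raise ValueError("baseline_count must be a positive integer")
    return 100.0 * new_count / baseline_count


# Illustrative counts only (hypothetical, not results from this case study).
print(percent_improvement(baseline_count=40, new_count=10))  # 25.0
```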

6.5. Evaluating Model Performance

Based on our selected evaluation metric, Query X performed most effectively.

6.6. Network analysis

Of the papers that referenced this dataset, XYZ came from ABC geographies and ABC institutions.
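
A breakdown of this kind can be produced by tallying citing papers on their metadata fields. The sketch below assumes each paper is a dictionary with `geography` and `institution` keys. These field names and the sample records are hypothetical and may not match the project's actual schema.

```python
from collections import Counter


def tally_by_field(papers, field):
    """Count papers per value of a metadata field, skipping missing values."""
    return Counter(paper[field] for paper in papers if paper.get(field))


# Illustrative records only (hypothetical, not results from this case study).
example_papers = [
    {"id": "p1", "geography": "Region A", "institution": "University 1"},
    {"id": "p2", "geography": "Region A", "institution": "University 2"},
    {"id": "p3", "geography": "Region B", "institution": None},
]

by_geography = tally_by_field(example_papers, "geography")
by_institution = tally_by_field(example_papers, "institution")
print(len(by_geography), "geographies;", len(by_institution), "institutions")
```

Records with a missing field are left out of the tally rather than grouped under a placeholder, which keeps the number of distinct geographies and institutions from being inflated.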

