Class DBLPResultAnalyzer
java.lang.Object
eu.openaire.dblp_benchmark.DBLPResultAnalyzer
- All Implemented Interfaces:
Serializable
Analyzes false positives and false negatives produced by OpenAIRE (AIDER) name matching.
Input: the joined ORCID/DBLP Parquet cache produced by DBLPBenchmark (the path <output>_joined_cache from that job).
Output (JSON): one record per erroneous classification:
- FP (false positive): OpenAIRE matched a DBLP author to an ORCID author whose ORCID does not match the ground-truth ORCID stored in the DBLP record.
- FN (false negative): OpenAIRE failed to match a DBLP author that has a ground-truth ORCID, but only when ORCID-ID matching would have succeeded (i.e. the correct author IS present in the ORCID dataset for that DOI). FNs where the correct author is absent from the ORCID dataset are excluded because those are data-coverage gaps, not algorithm failures.
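The FP/FN rules above can be read as a small decision procedure. The following is a minimal sketch of that logic only, not the analyzer's actual code: the class name ErrorClassifierSketch, the method classify, and its three parameters are hypothetical, and it assumes that records without a ground-truth ORCID are skipped because they cannot be judged.

```java
import java.util.Optional;

/** Illustrative sketch only; names and signature are hypothetical, not taken from DBLPResultAnalyzer. */
public class ErrorClassifierSketch {

    public enum ErrorKind { FALSE_POSITIVE, FALSE_NEGATIVE }

    /**
     * @param groundTruthOrcid  ORCID stored in the DBLP record, or null if the record has none
     * @param matchedOrcid      ORCID of the author chosen by name matching, or null if no match was made
     * @param truthInOrcidData  whether the ground-truth author is present in the ORCID dataset for the DOI
     * @return the kind of error, or empty if the record is not reported
     */
    public static Optional<ErrorKind> classify(String groundTruthOrcid,
                                               String matchedOrcid,
                                               boolean truthInOrcidData) {
        // Assumption: without a ground-truth ORCID there is nothing to compare against.
        if (groundTruthOrcid == null) {
            return Optional.empty();
        }
        if (matchedOrcid != null) {
            // A match was made: it is an error only if it points at a different ORCID.
            return matchedOrcid.equals(groundTruthOrcid)
                    ? Optional.empty()
                    : Optional.of(ErrorKind.FALSE_POSITIVE);
        }
        // No match was made: report it only when the correct author was available,
        // otherwise it is a data-coverage gap rather than an algorithm failure.
        return truthInOrcidData
                ? Optional.of(ErrorKind.FALSE_NEGATIVE)
                : Optional.empty();
    }

    public static void main(String[] args) {
        // Placeholder ORCID values, used purely for illustration.
        System.out.println(classify("0000-0000-0000-0001", "0000-0000-0000-0002", true));
        System.out.println(classify("0000-0000-0000-0001", null, false));
    }
}
```

In this reading, a missed match whose correct author is absent from the ORCID dataset yields no record at all, which mirrors the exclusion of data-coverage gaps described above.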
Usage:
spark-submit --class eu.openaire.dblp_benchmark.DBLPResultAnalyzer \
target/dblp-orcid-benchmark-0.1.1.jar \
--cachePath <output>_joined_cache \
--output <output-path>
Constructor Details
DBLPResultAnalyzer
public DBLPResultAnalyzer()
Method Details
main