Package eu.dnetlib.dhp.schema.solr
Class CitationCountByYearMapperUtil
java.lang.Object
eu.dnetlib.dhp.schema.solr.CitationCountByYearMapperUtil
Example mapper utilities for populating citationCountByYear on Result objects.
This class demonstrates the integration pattern for mapper/importer modules that consume
citation timeline data from upstream sources and populate the Result model.
Usage in a real mapper (pseudo-code):
// In dhp-hadoop or a similar mapper module:
import java.util.List;
import java.util.stream.Collectors;

public Result mapCitation(UpstreamCitationRecord record, Result result) {
    // Convert the upstream records, then canonicalize before storing them on the Result.
    List<CitationCountByYear> raw = extractCitationsByYear(record);
    List<CitationCountByYear> canonical = CitationCountByYearMapperUtil.populateCitations(raw);
    result.setCitationCountByYear(canonical);
    return result;
}

private List<CitationCountByYear> extractCitationsByYear(UpstreamCitationRecord record) {
    // Parse from JSON, Avro, or a database; UpstreamCitationRecord stands in for the source type.
    return record.getCitations().stream()
        .map(c -> CitationCountByYear.newInstance(c.getYear(), c.getCount()))
        .collect(Collectors.toList());
}
Nested Class Summary
Nested Classes
static interface CitationCountByYearMapperUtil.InvalidEntryHandler
    Callback interface for tracking invalid entries during mapping.
Constructor Summary
Constructors
CitationCountByYearMapperUtil()
Method Summary
static List<CitationCountByYear> populateCitations(List<CitationCountByYear> rawCitations)
    Populates and canonicalizes a citation-by-year list for use in a Result object.
static List<CitationCountByYear> populateCitationsWithTracking(List<CitationCountByYear> rawCitations, CitationCountByYearMapperUtil.InvalidEntryHandler invalidEntryHandler)
    Alternative: if you want to track/log invalid entries.
static boolean validateResult(Result result)
    Validates that a Result object has well-formed citation data (if present).
Constructor Details
CitationCountByYearMapperUtil
public CitationCountByYearMapperUtil()
Method Details
populateCitations
public static List<CitationCountByYear> populateCitations(List<CitationCountByYear> rawCitations)
Populates and canonicalizes a citation-by-year list for use in a Result object. This is the primary integration point for mappers:
- Normalizes input, which may be null or unsorted and may contain duplicates or invalid entries
- Returns a canonical list ready for indexing
- Logs/tracks invalid entries for monitoring (optional)
Parameters:
rawCitations - Raw citation entries from the upstream source.
Returns:
Canonicalized citation list suitable for Result.setCitationCountByYear().
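The normalization described above can be illustrated with a minimal, self-contained sketch. It does not reproduce this class's implementation: YearCount is a hypothetical stand-in for CitationCountByYear, and the rules shown (invalid entries dropped, duplicate years summed, output sorted by ascending year) are assumptions about what "canonical" means here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of a canonicalization step; not the library's code.
class CanonicalizeSketch {

    // Hypothetical stand-in for CitationCountByYear (boxed fields so nulls can occur).
    static final class YearCount {
        final Integer year;
        final Integer count;

        YearCount(Integer year, Integer count) {
            this.year = year;
            this.count = count;
        }
    }

    // Assumed rules: null input yields an empty list; entries that are null, lack a
    // year or count, or have a negative count are dropped; duplicate years are
    // summed; the result is sorted by ascending year.
    static List<YearCount> canonicalize(List<YearCount> raw) {
        List<YearCount> out = new ArrayList<>();
        if (raw == null) {
            return out;
        }
        Map<Integer, Integer> byYear = new TreeMap<>(); // iterates in ascending key order
        for (YearCount e : raw) {
            if (e == null || e.year == null || e.count == null || e.count < 0) {
                continue; // invalid entry: skipped
            }
            byYear.merge(e.year, e.count, Integer::sum);
        }
        for (Map.Entry<Integer, Integer> en : byYear.entrySet()) {
            out.add(new YearCount(en.getKey(), en.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<YearCount> raw = Arrays.asList(
            new YearCount(2021, 4),
            new YearCount(2019, 2),
            null,
            new YearCount(2021, 1),
            new YearCount(2020, -3));
        // Prints "2019 -> 2" then "2021 -> 5"
        for (YearCount yc : canonicalize(raw)) {
            System.out.println(yc.year + " -> " + yc.count);
        }
    }
}
```

Summing duplicate years is only one possible policy; a mapper could equally keep the last value seen, so the actual behavior should be checked against the implementation.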
populateCitationsWithTracking
public static List<CitationCountByYear> populateCitationsWithTracking(List<CitationCountByYear> rawCitations, CitationCountByYearMapperUtil.InvalidEntryHandler invalidEntryHandler)
Alternative to populateCitations for callers that want to track or log invalid entries.
Parameters:
rawCitations - Raw citation entries from the upstream source.
invalidEntryHandler - Optional callback to handle invalid entries (e.g., for metrics/logging).
Returns:
Canonicalized citation list suitable for Result.setCitationCountByYear().
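The callback pattern can be sketched as follows. This page does not show the method signature of InvalidEntryHandler, so the onInvalid(entry, reason) shape below is an assumption, as are the YearCount stand-in type and the validity rules. The sketch covers only the filtering and reporting step, not sorting or merging.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of reporting invalid entries through an optional callback.
class TrackingSketch {

    // Hypothetical stand-in for CitationCountByYear.
    static final class YearCount {
        final Integer year;
        final Integer count;

        YearCount(Integer year, Integer count) {
            this.year = year;
            this.count = count;
        }
    }

    // Assumed callback shape; the real InvalidEntryHandler may differ.
    interface InvalidEntryHandler {
        void onInvalid(YearCount entry, String reason);
    }

    // Returns null for a valid entry, otherwise a short reason (assumed rules).
    static String invalidReason(YearCount e) {
        if (e == null) return "null entry";
        if (e.year == null) return "missing year";
        if (e.count == null || e.count < 0) return "missing or negative count";
        return null;
    }

    // Keeps valid entries and reports the rest to the handler when one is supplied.
    static List<YearCount> filterWithTracking(List<YearCount> raw, InvalidEntryHandler handler) {
        List<YearCount> kept = new ArrayList<>();
        if (raw == null) {
            return kept;
        }
        for (YearCount e : raw) {
            String reason = invalidReason(e);
            if (reason == null) {
                kept.add(e);
            } else if (handler != null) { // the handler is optional
                handler.onInvalid(e, reason);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> reasons = new ArrayList<>();
        List<YearCount> kept = filterWithTracking(
            Arrays.asList(new YearCount(2020, 3), null, new YearCount(2021, -1)),
            (entry, reason) -> reasons.add(reason));
        System.out.println(kept.size() + " kept, " + reasons.size() + " rejected: " + reasons);
    }
}
```

In a mapper, the lambda would typically increment a job counter or write a log line rather than collect reasons in a list.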
validateResult
public static boolean validateResult(Result result)
Validates that a Result object has well-formed citation data (if present). Can be used in post-mapping validation or in result processors.
Parameters:
result - Result to validate.
Returns:
true if citationCountByYear is null, empty, or canonical; false if it is malformed.
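A well-formedness check of this kind might look like the sketch below. It operates on a bare list rather than a Result, uses a hypothetical YearCount stand-in for CitationCountByYear, and assumes "canonical" means no null entries or fields, non-negative counts, and strictly ascending years.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of a post-mapping validation check.
class ValidateSketch {

    // Hypothetical stand-in for CitationCountByYear.
    static final class YearCount {
        final Integer year;
        final Integer count;

        YearCount(Integer year, Integer count) {
            this.year = year;
            this.count = count;
        }
    }

    // Returns true for null or empty lists, and for lists that satisfy the
    // assumed canonical form; returns false otherwise.
    static boolean isWellFormed(List<YearCount> citations) {
        if (citations == null || citations.isEmpty()) {
            return true;
        }
        Integer previousYear = null;
        for (YearCount e : citations) {
            if (e == null || e.year == null || e.count == null || e.count < 0) {
                return false; // missing or negative data
            }
            if (previousYear != null && e.year <= previousYear) {
                return false; // out of order or duplicate year
            }
            previousYear = e.year;
        }
        return true;
    }

    public static void main(String[] args) {
        List<YearCount> sorted = Arrays.asList(new YearCount(2019, 2), new YearCount(2021, 5));
        List<YearCount> unsorted = Arrays.asList(new YearCount(2021, 5), new YearCount(2019, 2));
        System.out.println(isWellFormed(sorted));   // prints true
        System.out.println(isWellFormed(unsorted)); // prints false
        System.out.println(isWellFormed(Collections.<YearCount>emptyList())); // prints true
    }
}
```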