public class DataTransformationUtils extends Object
| Constructor and Description |
|---|
| DataTransformationUtils() |
| Modifier and Type | Method and Description |
|---|---|
| static ArrayList<DocumentInfos> | getListOfFailuresFromReport(String rsLocator, ArrayList<DocumentInfos> allDocuments, ArrayList<String> collectionId) Parses the reports contained in the result set coming from DTS and returns the list of document URIs that failed to be transformed. |
| static ArrayList<DocumentInfos> | getReports(String rsLocator, ArrayList<String> collectionId) |
| protected static org.gcube.common.searchservice.searchlibrary.rsreader.RSXMLReader | getRSClient(String epr) |
| static ArrayList<String> | performOCRtoPDF_HTTPInput(ArrayList<DocumentInfos> documents, String outputCollectionId, org.gcube.application.framework.core.session.ASLSession session) Transforms a list of PDF documents to text, using the OCR Service. |
| static String | transformPDFDocumentsToText(String listLocation, ArrayList<String> collectionId, String collectionName, String scope) Transforms a list of PDF documents to text documents, using DTS. |
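The collectionId argument of getListOfFailuresFromReport is documented as an empty list that the method fills with the ID of the output collection, so the caller reads the ID back from the list after the call. The sketch below illustrates that calling pattern only. It does not use the real gCube classes: the nested DocumentInfos class, the stub method body, the ".broken.pdf" failure rule, and the ID "output-collection-1" are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;

public class FailureReportSketch {

    /** Hypothetical stand-in for DocumentInfos; the real class's members are not shown in this reference. */
    static final class DocumentInfos {
        final String uri;

        DocumentInfos(String uri) {
            this.uri = uri;
        }
    }

    /**
     * Hypothetical stub that mimics the documented contract of
     * DataTransformationUtils.getListOfFailuresFromReport: it returns the
     * failed documents and fills the caller-supplied collectionId list.
     * The real method reads DTS reports from a result set instead.
     */
    static ArrayList<DocumentInfos> getListOfFailuresFromReport(
            String rsLocator,
            ArrayList<DocumentInfos> allDocuments,
            ArrayList<String> collectionId) {
        // Made-up output collection ID, written into the caller's list.
        collectionId.add("output-collection-1");
        ArrayList<DocumentInfos> failures = new ArrayList<>();
        // Illustration only: treat URIs ending in ".broken.pdf" as failed.
        for (DocumentInfos doc : allDocuments) {
            if (doc.uri.endsWith(".broken.pdf")) {
                failures.add(doc);
            }
        }
        return failures;
    }

    /** Calls the stub with an empty list and reads the collection ID back from it. */
    static String summarize(ArrayList<DocumentInfos> allDocuments) {
        ArrayList<String> collectionId = new ArrayList<>(); // starts empty, filled by the call
        ArrayList<DocumentInfos> failures =
                getListOfFailuresFromReport("rs-locator-placeholder", allDocuments, collectionId);
        String outputId = collectionId.isEmpty() ? "(none)" : collectionId.get(0);
        return failures.size() + " of " + allDocuments.size()
                + " failed; output collection: " + outputId;
    }

    public static void main(String[] args) {
        ArrayList<DocumentInfos> docs = new ArrayList<>();
        docs.add(new DocumentInfos("http://example.org/a.pdf"));
        docs.add(new DocumentInfos("http://example.org/b.broken.pdf"));
        // prints "1 of 2 failed; output collection: output-collection-1"
        System.out.println(summarize(docs));
    }
}
```

With the real class, the same pattern would presumably apply to getReports, which also takes a collectionId list, and the call would need to handle ReadingRSException.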
public static String transformPDFDocumentsToText(String listLocation, ArrayList<String> collectionId, String collectionName, String scope) throws ServiceEPRRetrievalException, TransformationException

Parameters:
listLocation - the location of the file containing the document URIs
collectionId - the requested output collection ID (empty if a new collection is about to be created)
collectionName - the name of the requested output collection
scope -
Throws:
ServiceEPRRetrievalException
TransformationException

public static ArrayList<DocumentInfos> getListOfFailuresFromReport(String rsLocator, ArrayList<DocumentInfos> allDocuments, ArrayList<String> collectionId) throws ReadingRSException

Parameters:
rsLocator - the RSLocator containing the reports from DTS
allDocuments - list of all the documents that participated in the transformation attempt
collectionId - empty list that needs to be filled with the ID of the output collection
Throws:
ReadingRSException

public static ArrayList<DocumentInfos> getReports(String rsLocator, ArrayList<String> collectionId) throws ReadingRSException

Throws:
ReadingRSException

protected static org.gcube.common.searchservice.searchlibrary.rsreader.RSXMLReader getRSClient(String epr) throws ReadingRSException

Throws:
ReadingRSException

public static ArrayList<String> performOCRtoPDF_HTTPInput(ArrayList<DocumentInfos> documents, String outputCollectionId, org.gcube.application.framework.core.session.ASLSession session) throws ServiceEPRRetrievalException, OCRException

Parameters:
documents - the list of documents to be transformed
outputCollectionId - the collection into which the output will be inserted
session -
Throws:
ServiceEPRRetrievalException
OCRException

Copyright © 2013. All Rights Reserved.