org.gcube.application.framework.contentmanagement.datatransformation.util
Class DataTransformationUtils
java.lang.Object
org.gcube.application.framework.contentmanagement.datatransformation.util.DataTransformationUtils
public class DataTransformationUtils
- extends Object
Field Summary
protected org.gcube.common.core.utils.logging.GCUBELog logger
Method Summary
static ArrayList<DocumentInfos> getListOfFailuresFromReport(String rsLocator, ArrayList<DocumentInfos> allDocuments, ArrayList<String> collectionId)
- Parses the reports contained in the result set coming from DTS and returns the list of the document URIs that failed to be transformed.
static ArrayList<DocumentInfos> getReports(String rsLocator, ArrayList<String> collectionId)
protected static org.gcube.common.searchservice.searchlibrary.rsreader.RSXMLReader getRSClient(String epr)
static ArrayList<String> performOCRtoPDF_HTTPInput(ArrayList<DocumentInfos> documents, String outputCollectionId, org.gcube.application.framework.core.session.ASLSession session)
- Transforms a list of PDF documents to text, using the OCR Service.
static String transformPDFDocumentsToText(String listLocation, ArrayList<String> collectionId, String collectionName, String scope)
- Transforms a list of PDF documents to Text documents, using DTS.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
logger
protected final org.gcube.common.core.utils.logging.GCUBELog logger
DataTransformationUtils
public DataTransformationUtils()
transformPDFDocumentsToText
public static String transformPDFDocumentsToText(String listLocation,
ArrayList<String> collectionId,
String collectionName,
String scope)
throws ServiceEPRRetrievalException,
TransformationException
- Transforms a list of PDF documents to Text documents, using DTS. It returns an RSLocator of the resultset containing the reports for the transformations.
- Parameters:
listLocation - the location of the file containing the document URIs
collectionId - the requested output collection id (empty if a new collection is to be created)
collectionName - the name of the requested output collection
scope -
- Returns:
- the RSLocator of the result set containing the reports from the transformation
- Throws:
ServiceEPRRetrievalException
TransformationException
getListOfFailuresFromReport
public static ArrayList<DocumentInfos> getListOfFailuresFromReport(String rsLocator,
ArrayList<DocumentInfos> allDocuments,
ArrayList<String> collectionId)
throws ReadingRSException
- Parses the reports contained in the result set coming from DTS and returns the list of the document URIs that failed to be transformed.
- Parameters:
rsLocator - the RSLocator containing the reports from DTS
allDocuments - list of all the documents that participated in the transformation attempt
collectionId - empty list that needs to be filled with the id of the Collection Output
- Returns:
- the documents that failed to be transformed
- Throws:
ReadingRSException
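The contract above (return the failed documents, and fill a caller-supplied empty list with the output collection id) can be illustrated with a minimal, self-contained sketch. It does not use the gCube classes: Doc, Report and failuresFrom are hypothetical stand-ins invented for this illustration, and the rule that a document without a successful report counts as failed is an assumption, not documented behaviour of getListOfFailuresFromReport.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: none of these types belong to the gCube API.
final class FailureReportSketch {

    // Hypothetical stand-in for DocumentInfos (its real fields are not documented here).
    static final class Doc {
        final String uri;
        Doc(String uri) { this.uri = uri; }
    }

    // Hypothetical stand-in for one report record read from the result set.
    static final class Report {
        final String documentUri;
        final boolean succeeded;
        final String outputCollectionId;
        Report(String documentUri, boolean succeeded, String outputCollectionId) {
            this.documentUri = documentUri;
            this.succeeded = succeeded;
            this.outputCollectionId = outputCollectionId;
        }
    }

    // Returns the documents that failed and uses collectionId as an out-parameter.
    // Assumption: a document with no successful report is treated as failed.
    static ArrayList<Doc> failuresFrom(List<Report> reports,
                                       ArrayList<Doc> allDocuments,
                                       ArrayList<String> collectionId) {
        Set<String> succeeded = new HashSet<>();
        for (Report r : reports) {
            if (r.succeeded) {
                succeeded.add(r.documentUri);
            }
            // Record the output collection id once, in the caller's list.
            if (collectionId.isEmpty() && r.outputCollectionId != null) {
                collectionId.add(r.outputCollectionId);
            }
        }
        ArrayList<Doc> failed = new ArrayList<>();
        for (Doc d : allDocuments) {
            if (!succeeded.contains(d.uri)) {
                failed.add(d);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        ArrayList<Doc> all = new ArrayList<>();
        all.add(new Doc("doc:1"));
        all.add(new Doc("doc:2"));

        List<Report> reports = new ArrayList<>();
        reports.add(new Report("doc:1", true, "col-A"));
        reports.add(new Report("doc:2", false, "col-A"));

        ArrayList<String> collectionId = new ArrayList<>(); // must start empty
        for (Doc d : failuresFrom(reports, all, collectionId)) {
            System.out.println("failed: " + d.uri);
        }
        System.out.println("output collection: " + collectionId);
    }
}
```

With the real class, the same calling pattern would presumably apply: pass a new empty ArrayList<String> as collectionId and read the output collection id from it after the call returns.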
getReports
public static ArrayList<DocumentInfos> getReports(String rsLocator,
ArrayList<String> collectionId)
throws ReadingRSException
- Throws:
ReadingRSException
getRSClient
protected static org.gcube.common.searchservice.searchlibrary.rsreader.RSXMLReader getRSClient(String epr)
throws ReadingRSException
- Throws:
ReadingRSException
performOCRtoPDF_HTTPInput
public static ArrayList<String> performOCRtoPDF_HTTPInput(ArrayList<DocumentInfos> documents,
String outputCollectionId,
org.gcube.application.framework.core.session.ASLSession session)
throws ServiceEPRRetrievalException,
OCRException
- Transforms a list of PDF documents to text using the OCR Service and returns a list of the CM URIs of the output documents.
It also copies the generated output to the collection given as a parameter.
- Parameters:
documents - the list of documents to be transformed
outputCollectionId - the collection into which the output will be inserted
session -
- Returns:
- list of CM URIs of the transformed documents
- Throws:
ServiceEPRRetrievalException
OCRException
Copyright © 2013. All Rights Reserved.