Hello,
I’m trying to set up a data mapping configuration to extract data from PDF files.
Some PDF files won’t open and produce the errors shown below.
Does anybody know which kinds of PDF files are supported by the data mapper?
To be more precise, I’d like to know what is supported for each of these:
PDF version: 1.4 … 1.9
font type: TrueType, Type 1 …
font encoding: ANSI, CID …
font embedding: yes, no
other: layers …
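In case it helps to compare working and failing files, here is a minimal sketch (Python, not part of Connect) for reading the version declared in a PDF's header. The function names are my own and purely illustrative. It only looks at the %PDF-x.y comment near the start of the file; a /Version entry in the document catalog can override that value, and this sketch ignores it.

```python
import re


def pdf_header_version(data: bytes):
    """Return the version declared in the PDF header (e.g. "1.4"), or None."""
    # The header comment looks like %PDF-1.4 and should sit near the start of the file.
    match = re.search(rb"%PDF-(\d\.\d)", data[:1024])
    return match.group(1).decode("ascii") if match else None


def pdf_file_version(path):
    """Read the first bytes of the file at `path` and return its header version."""
    with open(path, "rb") as f:
        return pdf_header_version(f.read(1024))
```

Calling pdf_file_version on each input file would give a quick list of declared versions to set against the files that fail to open.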
ERROR [24 Feb 2015 14:32:19,496][ModalContext] com.objectiflune.weaver.textextraction.rest.client.ExtractionRestClient.getPage(?:?) POST http://localhost:51316/rest/weaverengine/extractor/getPage/C:\Users\Administrator.vmwin7ol\Connect\temp\Connectdesigner\1740\inputdata.8300413086049324830.pdf/0 returned a response status of 500 Internal Server Error
ERROR [24 Feb 2015 14:32:19,673][main] com.objectiflune.datamining.ui.model.DataMiningModel.loadDocument(?:?) [COMPONENT=Data Mapping][SOURCE=Internal] Unable to open the document 1 (DME000049)
java.lang.Exception: com.objectiflune.datamining.pdf.pdfengine.textextract.TextExtractorException: Error while retrieving character data (DME000165)
at com.objectiflune.datamining.ui.model.DataMiningModel.loadDocument(Unknown Source)
at com.objectiflune.datamining.ui.model.DataMiningModel.setDocumentIndex(Unknown Source)
at com.objectiflune.datamining.ui.model.DataMiningModel.updateDocumentCount(Unknown Source)
at com.objectiflune.datamining.ui.model.RefreshBoundariesJob$1.run(Unknown Source)
at org.eclipse.swt.widgets.RunnableLock.run(RunnableLock.java:35)
at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Synchronizer.java:135)
at org.eclipse.swt.widgets.Display.runAsyncMessages(Display.java:4144)
at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:3761)
at org.eclipse.ui.internal.Workbench.runEventLoop(Workbench.java:2701)
at org.eclipse.ui.internal.Workbench.runUI(Workbench.java:2665)
at org.eclipse.ui.internal.Workbench.access$4(Workbench.java:2499)
at org.eclipse.ui.internal.Workbench$7.run(Workbench.java:679)
at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:332)
at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:668)
at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:149)
at com.objectiflune.application.Application.start(Unknown Source)
at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196)
at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110)
at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79)
at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:353)
at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:180)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:629)
at org.eclipse.equinox.launcher.Main.basicRun(Main.java:584)
at org.eclipse.equinox.launcher.Main.run(Main.java:1438)
at org.eclipse.equinox.launcher.Main.main(Main.java:1414)
Caused by: com.objectiflune.datamining.pdf.pdfengine.textextract.TextExtractorException: Error while retrieving character data (DME000165)
at com.objectiflune.datamining.pdf.pdfengine.textextract.internal.WeaverExtractorEngine.analyze(Unknown Source)
at com.objectiflune.datamining.pdf.data.PDFDocumentData.analyzePage(Unknown Source)
at com.objectiflune.datamining.pdf.data.PDFDocumentData.reset(Unknown Source)
at com.objectiflune.datamining.pdf.data.PDFDocumentData.open(Unknown Source)
at com.objectiflune.datamining.Document.open(Unknown Source)
at com.objectiflune.datamining.ui.model.DataMiningModel$LoadDocumentRunnable.run(Unknown Source)
at org.eclipse.jface.operation.ModalContext$ModalContextThread.run(ModalContext.java:121)
Caused by: java.lang.NullPointerException
at nl.edmond.weaver.api.textextraction.TextAnalyzer.analyzePage(Unknown Source)
... 7 more