This article shows you how to extract tables from PDF documents. You may have many PDF files containing tables that you want to use separately. Copying and pasting those tables is not a good option because it often does not give the expected output, so you need simple tools that can extract tables from a PDF file and save them as separate files.

Most of these PDF table extractor tools cannot help if the table in the PDF is scanned. In that case, first make the PDF searchable and then try these options.

Extract Tables from PDF documents

This post covers 2 free online services and 3 free programs for extracting tables from a PDF file:

PDF to XLS
PDFtoExcel.com
Tabula
ByteScout PDF Multitool
Sejda PDF Desktop.

1] PDF to XLS

PDF to XLS is one of the best options for extracting tables from a PDF. Two features make it handy: it can fetch tables from 20 PDF documents at a time, and table extraction is automatic. The output is an XLSX file. If a PDF has multiple tables, each table is stored on a separate sheet of the output XLSX file.

Open the homepage of this service, then drag and drop your PDF files or use the UPLOAD FILES button. Each uploaded PDF is converted to an XLSX file automatically. When the output files are ready, you can download them one by one or download a single ZIP file that contains all of them.
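To check what a batch download contains, the ZIP can be inspected with a short script. The following is a minimal Python sketch and not part of the PDF to XLS service. It assumes the ZIP holds one XLSX file per converted PDF and relies on the standard XLSX layout, in which sheet names are listed in xl/workbook.xml. The function names and the file name in the usage note are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespace of the main workbook part in an XLSX (SpreadsheetML) file.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def sheet_names(xlsx_bytes: bytes) -> list[str]:
    """Return the sheet names of an XLSX file given as raw bytes."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as book:
        root = ET.fromstring(book.read("xl/workbook.xml"))
    return [sheet.get("name") for sheet in root.findall("m:sheets/m:sheet", NS)]


def tables_per_pdf(zip_path) -> dict[str, list[str]]:
    """Map each XLSX file inside the downloaded ZIP to its sheet names.

    zip_path may be a file path or a binary file-like object.
    """
    result = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".xlsx"):
                result[name] = sheet_names(archive.read(name))
    return result
```

Calling tables_per_pdf("converted.zip") on a downloaded archive would list the sheets of every workbook, which makes it easy to see how many tables were found in each PDF.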

2] PDFtoExcel.com

The PDFtoExcel.com service can extract tables from only one PDF at a time, but it supports several upload sources: your desktop, OneDrive, Google Drive, and Dropbox. The conversion process is automatic.

Open the homepage of this service and select an upload option to add a PDF. The service then uploads the PDF and converts it to an Excel (XLSX) file automatically. When the output is ready, you get a download link to save the output file containing the PDF table(s).

Note: Although this service says it can also extract tables from scanned PDF files, it didn't work for me. You can still try it on a scanned PDF.

3] Tabula

Tabula is a powerful program that can automatically detect the tables in a PDF and lets you save them as a TSV, JSON, or CSV file. You can save a separate CSV file for each PDF table or save all the tables in a single CSV file.

Tabula is an open-source PDF table extractor that you can download from its website. It requires Java to run.

Extract the ZIP file that you downloaded and run the tabula.exe file. It opens a page in your default browser. If the page does not open, enter http://localhost:8080 in your browser's address bar and press Enter.
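If the browser page stays blank, it can help to confirm that something is actually answering on port 8080 before troubleshooting further. The following is a minimal Python sketch, not part of Tabula. It only tests whether a TCP connection to the address can be opened, so it cannot tell whether the program listening there is Tabula. The function name is hypothetical, and the default port is the one mentioned in this article.

```python
import socket


def is_listening(host: str = "localhost", port: int = 8080, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when no connection can be established.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling is_listening() with no arguments checks localhost on port 8080. A False result suggests that tabula.exe has not started, for example because Java is missing.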

You will now see its interface, where you can use the Browse option to add a PDF. After that, press the Import button. Once the PDF is added, its pages appear in the interface.

Use the Autodetect Tables button to highlight all the tables in that PDF automatically. You can also highlight a table manually by selecting it yourself. If you want, you can also remove any of the selected tables.

This helps you save only the tables you want. When the PDF tables are highlighted, click the Preview & Export Extracted Data button.

Finally, use the drop-down menu at the top to select an output format and press the Export button. This saves the PDF tables to a file in the format you selected.
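Once a table has been exported, it can be loaded for further processing. The following is a minimal Python sketch using only the standard library. It assumes the export is a CSV or TSV file whose first row contains the column headers, which will not hold for every table. The function name and the file name in the usage note are hypothetical.

```python
import csv
import io
from typing import TextIO


def load_table(stream: TextIO, delimiter: str = ",") -> list[dict[str, str]]:
    """Read one exported table into a list of row dictionaries.

    The first row is assumed to hold the column headers.
    Pass delimiter="\t" for a TSV export.
    """
    return list(csv.DictReader(stream, delimiter=delimiter))


# Small in-memory example standing in for an exported file.
sample = io.StringIO("Item,Price\nPen,1.50\nBook,12.00\n")
for row in load_table(sample):
    print(row["Item"], row["Price"])
```

For a real export, open the file with open("table.csv", newline="", encoding="utf-8") and pass the file object to load_table. The UTF-8 encoding is an assumption about the exported file.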

4] ByteScout PDF Multitool

As the name suggests, this software comes with multiple tools, such as converting a PDF to a multipage TIFF, rotating a PDF document, making a PDF unsearchable, optimizing a PDF, adding an image to a PDF, and more. It also includes a PDF table detector, which is quite impressive. The advantage of this tool is that it can also extract tables from scanned PDFs. It can detect tables across multiple pages and extract them as a CSV, XLS, XML, TXT, or JSON file. Before extraction, you can also set a page range to extract tables from specific pages only.

You can download this software from its website. It is free for non-commercial use only. After installation, run the software and use the Open Document option to add a PDF. Then click the Detect tables tool, which is in the Data Extraction category.

A box opens where you can set the conditions for detecting tables. For example, you can set the minimum number of columns, the minimum number of rows, and the minimum line breaks between tables, and set the table detection mode to bordered or borderless tables. Adjust these options or keep the default settings.
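To make the effect of the minimum row and column settings concrete, here is a minimal Python sketch of how such thresholds could filter candidate tables. It is only an illustration of the idea and does not reflect how ByteScout PDF Multitool works internally. The class name, function name, and default values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DetectionRules:
    """Hypothetical thresholds, loosely modeled on the options described above."""
    min_rows: int = 2
    min_columns: int = 2


def looks_like_table(rows: list[list[str]], rules: DetectionRules) -> bool:
    """Accept a candidate only if it has enough rows and every row has enough cells."""
    if len(rows) < rules.min_rows:
        return False
    return all(len(row) >= rules.min_columns for row in rows)


rules = DetectionRules(min_rows=2, min_columns=3)
candidate = [["Item", "Qty", "Price"], ["Pen", "2", "1.50"]]
print(looks_like_table(candidate, rules))
```

Raising the thresholds rejects small blocks of aligned text that are not real tables, while lowering them catches small tables at the cost of more false detections.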

After that, press the Detect next table button in that box. It identifies and selects a table on the current page. You can then move to another page and detect more tables in the same way.

When you're done, use the Proceed to extraction button and select the output format. Finally, choose whether to save the tables from the current page or from a page range that you define, and save the output.

The tool gives satisfactory output. Sometimes, however, it detects other content in the PDF as a table, and it may fail to extract tables from multiple pages. In that case, use it to detect and save tables one by one.

5] Sejda PDF Desktop

Sejda PDF Desktop is also a multipurpose program. It can optimize or compress a PDF, add a watermark to a PDF, remove restrictions from a PDF, edit PDF documents, and so on. However, its free plan has limitations: you can perform only 3 tasks per day, and the PDF size limit is 50 MB or 10 pages.

You can use its PDF to Excel conversion tool to extract PDF tables. It automatically detects the tables in the PDF pages and lets you save them as XLSX or CSV.

Download it from its website. After installation, select the PDF to Excel tool from the main interface and use the Choose PDF files button. Only one PDF can be added in the free plan.

Once the PDF is added, it shows the Convert PDF to CSV and Convert PDF to Excel buttons. Click either button, and then save the output to the desired location on your PC.

Its PDF table detection is good, and you don't have to detect tables manually. Still, it sometimes treats other text content as a PDF table and stores it in the output. Overall, though, the results are good.

That's all.

These are some good tools for extracting tables from PDFs. Tabula is more effective than the other tools. Still, you can try them all and see which one works best for you.

Similar reads:

Extract attachments from PDF
Extract highlighted text from PDF.

Consuelo Diez

About the author

I am an engineer with more than 10 years of experience in Windows, iOS, PDF, errors, and gadgets. I have worked on many high-quality Windows applications and frameworks, such as OneDrive for Business, Office 365, and more. My recent work has included developing a PDF reader for the Windows platform and making error messages clearer for users. I have also been involved in iOS platform development for several years and am very familiar with its features and quirks.
