docxtractr: Extract Data Tables and Comments from 'Microsoft' 'Word' Documents

'Microsoft Word' 'docx' files provide an 'XML' structure that is fairly straightforward to navigate, especially for 'Word' tables and comments. Tools are provided to determine table count and structure, to determine comment count, and to extract and clean tables and comments from 'Microsoft Word' 'docx' documents. There is also nascent support for '.doc' and '.pptx' files.
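
To illustrate the 'XML' structure described above, here is a minimal sketch in Python that uses only the standard library. It is not docxtractr's implementation or API. It only shows that a 'docx' file is a zip archive whose main part, word/document.xml, stores tables as w:tbl elements containing rows (w:tr) and cells (w:tc). The helper names make_sample_docx and extract_tables are hypothetical, and the sample archive is a stripped-down stand-in for a real document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the main document part of a .docx file.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def make_sample_docx() -> bytes:
    """Build a tiny in-memory stand-in for a .docx file (hypothetical sample).

    A real document carries more parts (for example [Content_Types].xml);
    only word/document.xml is needed for this illustration.
    """
    def cell(text: str) -> str:
        return f"<w:tc><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:tc>"

    def row(*cells: str) -> str:
        return "<w:tr>" + "".join(cell(c) for c in cells) + "</w:tr>"

    body = "<w:tbl>" + row("name", "qty") + row("apple", "3") + "</w:tbl>"
    xml = f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", xml)
    return buf.getvalue()


def extract_tables(docx_bytes: bytes) -> list:
    """Return every table as a list of rows, each row a list of cell strings."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    tables = []
    # iter() walks the whole tree, so nested tables are also reported.
    for tbl in root.iter(f"{{{W_NS}}}tbl"):
        rows = []
        for tr in tbl.findall("w:tr", NS):
            cells = []
            for tc in tr.findall("w:tc", NS):
                # A cell's text is spread over one or more w:t runs.
                cells.append("".join(t.text or "" for t in tc.iter(f"{{{W_NS}}}t")))
            rows.append(cells)
        tables.append(rows)
    return tables


tables = extract_tables(make_sample_docx())
print(len(tables))   # table count
print(tables[0])     # rows of the first table
```

Comments live in a separate part of the archive, word/comments.xml, as w:comment elements, so the same approach applies to them. Cleaning steps such as promoting a header row or trimming whitespace are left out of this sketch.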

Version: 0.6.5
Depends: R (≥ 3.6.0)
Imports: tools, xml2, purrr, dplyr, utils, httr, magrittr
Suggests: covr, tinytest
Published: 2020-07-05
Author: Bob Rudis [aut, cre], Mark Dulhunty [ctb], Karlo Guidoni-Martins [ctb], Chris Muir [aut, ctb], John Muschelli [ctb]
Maintainer: Bob Rudis <bob at rud.is>
BugReports: https://gitlab.com/hrbrmstr/docxtractr/issues
License: MIT + file LICENSE
URL: http://gitlab.com/hrbrmstr/docxtractr
NeedsCompilation: no
SystemRequirements: LibreOffice (<https://www.libreoffice.org/>) required to extract data from .doc files or perform .pptx conversion.
Materials: NEWS
CRAN checks: docxtractr results

Documentation:

Reference manual: docxtractr.pdf

Downloads:

Package source: docxtractr_0.6.5.tar.gz
Windows binaries: r-devel: docxtractr_0.6.5.zip, r-release: docxtractr_0.6.5.zip, r-oldrel: docxtractr_0.6.5.zip
macOS binaries: r-release (arm64): docxtractr_0.6.5.tgz, r-oldrel (arm64): docxtractr_0.6.5.tgz, r-release (x86_64): docxtractr_0.6.5.tgz
Old sources: docxtractr archive

Linking:

Please use the canonical form https://CRAN.R-project.org/package=docxtractr to link to this page.