| Name | Date | Size | #Lines | LOC |
|---|---|---|---|---|
| .. | 03-May-2022 | - | | |
| LICENSES/ | 03-May-2022 | - | | |
| autotests/ | 03-May-2022 | - | 4,052 | 2,902 |
| cmake/ | 04-Dec-2021 | - | 374 | 296 |
| docs/ | 04-Dec-2021 | - | 11 | 9 |
| po/ | 04-Dec-2021 | - | 40,145 | 33,665 |
| src/ | 03-May-2022 | - | 9,334 | 6,556 |
| tests/ | 03-May-2022 | - | 72 | 51 |
| .git-blame-ignore-revs | 04-Dec-2021 | 53 | 3 | 2 |
| .gitignore | 04-Dec-2021 | 311 | 30 | 29 |
| .gitlab-ci.yml | 04-Dec-2021 | 284 | 7 | 5 |
| .kde-ci.yml | 04-Dec-2021 | 342 | 13 | 11 |
| .reviewboardrc | 04-Dec-2021 | 105 | 4 | 3 |
| KF5FileMetaDataConfig.cmake.in | 04-Dec-2021 | 195 | 8 | 5 |
| Messages.sh | 04-Dec-2021 | 203 | 6 | 3 |
| README.md | 04-Dec-2021 | 4.5 KiB | 87 | 55 |
| metainfo.yaml | 04-Dec-2021 | 383 | 21 | 19 |

README.md

# KFileMetaData

## Introduction

KFileMetaData provides a simple library for extracting the text and metadata
from a number of different file types. This library is typically used by file
indexers to retrieve the metadata. Applications can also use it to write
metadata back to files.

## Using the library

In order to use the library you must implement your own `ExtractionResult`
class. Instances of this class will be passed to every applicable plugin,
which will populate them with the extracted information.

For convenience a `SimpleExtractionResult` class has been provided which stores
all the data in memory and allows it to be introspected later. Most clients
*should* implement their own `ExtractionResult`, as the data can get quite
large when extracting the text content from very large files.

The library also supports plugins that write back data.

23
24This requires us to create a `ExtractionPluginManger` class, fetch the extractor
25plugins which are applicable for that file, and then pass the instance of
26`ExtractionResult` to each Extractor.
27
28A simple test example called `dump.cpp` has been written.
29
## Writing Metadata to a file

This requires calling the `fetchWriters()` method of the `WriterCollection` class with the mimetype of the file that needs to be written to. This method returns a list of writers; to actually write metadata, call the `write()` method of a writer. The `write()` method accepts an instance of the `WriteData` class, which stores a mapping of the properties to be written and their values.

## Writing a custom file extractor

Metadata is extracted with the help of extraction plugins. Each plugin
provides a list of mimetypes that it supports, and implements the extraction
function which extracts the data and fills it into an `ExtractionResult`.

Extractors for most of the common file types are already provided by the
library.

Extractors should typically avoid implementing any logic themselves and should
just be wrappers on top of existing libraries.

## Writing a custom metadata writer

The writeback framework uses an approach similar to the extraction framework. Each writer plugin supports a list of mimetypes and implements the write function, which takes a `WriteData` object as input.

### Adding data into an `ExtractionResult`

The `ExtractionResult` can be filled with (key, value) pairs and plain text. The
keys in these pairs typically correspond to a predefined property. The list
of properties is defined in the `properties.h` header. Every plugin should
use the properties defined in this header. If a required property is missing
then it should be added to this framework.

The `ExtractionResult` should also be given a list of types. These types are
defined in the `types.h` header. They correspond to the higher-level categories
of files that the user typically expects.

## Writing an external plugin

Extractors and writers can also be written in other languages and installed into the system,
and KFileMetaData will be able to find them and use them.

An external plugin must be an independently executable file (a binary, a
script with a hashbang line and the executable permission set, a batch file or
cmd script, etc.). It must be located within the libexec directory.

KFileMetaData will wrap each external extractor with an instance of the `ExternalExtractor` class, and each external writer with an instance of `ExternalWriter`. The application is free to choose any of the plugins returned by `WriterCollection` or `ExtractorCollection`.

Every external plugin is placed in its own directory within `libexec/kf5/kfilemetadata/externalextractors`. Every plugin must have a `manifest.json` file that specifies the mimetypes that the plugin supports and the main executable file. A sample manifest file is located at `src/writers/externalwriters/example/manifest.json`.

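As a rough illustration of what such a manifest might contain, the Python sketch below builds one and serializes it to JSON. The key names `main` and `mimetypes`, the shape of the `mimetypes` object, and the file name `main.py` are assumptions made for this example and are not confirmed by this README; the sample manifest in the source tree is the authoritative reference.

```python
import json


def build_manifest(main_file, mimetypes):
    """Return manifest.json text for a hypothetical external plugin.

    Assumed layout (not confirmed by the README): "main" names the
    executable relative to the plugin directory, and "mimetypes" is an
    object with one key per supported mimetype.
    """
    manifest = {
        "main": main_file,
        "mimetypes": {mimetype: {} for mimetype in mimetypes},
    }
    return json.dumps(manifest, indent=4, sort_keys=True)
```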
Both kinds of plugins accept the target file as an argument.

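When developing a plugin it can help to exercise it outside of KFileMetaData. The following is a minimal test harness, not part of the library: it starts a plugin with the target file as its argument, sends a JSON request on stdin, enforces a time limit, and decodes the JSON reply. The function name `run_plugin` and the five-second default are arbitrary choices for this sketch.

```python
import json
import subprocess


def run_plugin(command, target_file, request, timeout_seconds=5.0):
    """Run an external plugin once and return its decoded JSON reply.

    `command` is the plugin's argument vector (for example, the path of its
    main executable); the target file is appended as the final argument.
    """
    completed = subprocess.run(
        list(command) + [target_file],
        input=json.dumps(request),   # request is delivered on stdin
        capture_output=True,
        text=True,
        timeout=timeout_seconds,     # raises subprocess.TimeoutExpired if exceeded
        check=False,
    )
    return json.loads(completed.stdout)
```

A plugin that writes anything other than a single JSON document to stdout makes `json.loads` raise `ValueError`, which is a quick way to catch stray debug output.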
### Writing an external extractor

Extractors take JSON-formatted input specifying the input mimetype, and return JSON output with the extracted properties. The JSON output also indicates any errors that might have occurred. Calls to the extractor are blocking, hence there is a time limit for how long they can run.

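The exchange described above could look roughly like the following Python sketch of a plain-text extractor. The JSON key names (`mimetype` in the request; `properties`, `status` and `error` in the reply) and the property names `lineCount` and `wordCount` are assumptions made for illustration; the existing external plugins in the source tree show the exact format the library expects.

```python
import json

# Mimetypes this hypothetical plugin handles (they would also appear in its manifest).
SUPPORTED_MIMETYPES = ("text/plain",)


def error_reply(message):
    # Assumed reply shape for failures.
    return {"properties": {}, "status": "ERROR", "error": message}


def handle_request(request, path):
    """Build the reply for one extraction request on the file at `path`."""
    mimetype = request.get("mimetype", "") if isinstance(request, dict) else ""
    if mimetype not in SUPPORTED_MIMETYPES:
        return error_reply("unsupported mimetype: %r" % (mimetype,))
    try:
        with open(path, encoding="utf-8", errors="replace") as handle:
            text = handle.read()
    except OSError as exc:
        return error_reply(str(exc))
    # Assumed property names; real plugins should use names known to the framework.
    properties = {
        "lineCount": len(text.splitlines()),
        "wordCount": len(text.split()),
    }
    return {"properties": properties, "status": "OK", "error": ""}


def main(argv, stdin, stdout):
    """Read one JSON request from stdin and write one JSON reply to stdout."""
    if len(argv) < 2:
        reply = error_reply("missing target file argument")
    else:
        try:
            reply = handle_request(json.loads(stdin.read()), argv[1])
        except ValueError as exc:  # malformed JSON on stdin
            reply = error_reply(str(exc))
    json.dump(reply, stdout)
    return 0
```

To turn this into an executable plugin, add a hashbang line and call `main(sys.argv, sys.stdin, sys.stdout)` under the usual `if __name__ == "__main__":` guard. Because calls are blocking and time-limited, the sketch does its work in a single pass and reports failures in the reply.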
### Writing an external writer

Writers take in the properties to be changed via stdin and return JSON output with the success value of the write operation. Calls to the writer are blocking, hence there is a time limit for how long they can run.

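A writer can follow the same pattern. The Python sketch below reads the properties from stdin and reports success or failure as JSON. The key names (`properties`, `status`, `error`) are assumptions, and `apply_properties` is a hypothetical stand-in that stores the values in a sidecar file next to the target; a real writer would hand them to a format-specific tagging library instead.

```python
import json


def apply_properties(path, properties):
    """Stand-in for a real tagging library: merge properties into a sidecar file."""
    sidecar = path + ".meta.json"
    try:
        with open(sidecar, encoding="utf-8") as handle:
            stored = json.load(handle)
    except FileNotFoundError:
        stored = {}
    if not isinstance(stored, dict):
        stored = {}
    stored.update(properties)
    with open(sidecar, "w", encoding="utf-8") as handle:
        json.dump(stored, handle, indent=2, sort_keys=True)


def handle_write(request, path):
    """Apply one write request to `path` and report the outcome."""
    properties = request.get("properties") if isinstance(request, dict) else None
    if not isinstance(properties, dict):
        return {"status": "ERROR", "error": "request has no properties object"}
    try:
        apply_properties(path, properties)
    except (OSError, ValueError) as exc:
        return {"status": "ERROR", "error": str(exc)}
    return {"status": "OK", "error": ""}


def main(argv, stdin, stdout):
    """Read the properties from stdin and write the JSON result to stdout."""
    if len(argv) < 2:
        reply = {"status": "ERROR", "error": "missing target file argument"}
    else:
        try:
            reply = handle_write(json.loads(stdin.read()), argv[1])
        except ValueError as exc:  # malformed JSON on stdin
            reply = {"status": "ERROR", "error": str(exc)}
    json.dump(reply, stdout)
    # Mirroring the status in the exit code is a choice of this sketch.
    return 0 if reply["status"] == "OK" else 1
```

As with the extractor sketch, an executable plugin would add a hashbang line and call `main(sys.argv, sys.stdin, sys.stdout)` from an `if __name__ == "__main__":` guard.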
## Links
- Mailing list: <https://mail.kde.org/mailman/listinfo/kde-devel>
- IRC channel: #kde-devel on Libera Chat