
Exploring your remote data in a breeze

Instantly find and access all your Nextcloud data files

To use fairdatanow, you need a Nextcloud server URL, a username, and a password that give access to a folder on a Nextcloud server. The recommended way to handle these credentials in Jupyter notebooks is to store the username and password as environment variables on your system and retrieve them with the os.getenv() function.

Note

This way you avoid typing them directly into the notebook, which is not safe if you need to share the notebook with others.
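As a minimal sketch of this pattern, the helper below reads the two variable names used on this page and fails early with a clear message if either one is missing. Note that read_credentials is a hypothetical helper written for illustration; it is not part of fairdatanow.

```python
import os


def read_credentials(user_var="NC_AUTH_USER", pass_var="NC_AUTH_PASS"):
    """Return (user, password) read from environment variables.

    Hypothetical helper for illustration, not part of fairdatanow.
    """
    user = os.getenv(user_var)
    password = os.getenv(pass_var)
    # Collect the names of variables that are unset or empty.
    missing = [name for name, value in ((user_var, user), (pass_var, password)) if not value]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return user, password
```

Checking up front avoids passing None on to the server, where the resulting error is usually harder to interpret.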

To get started, import the RemoteData class from the package and instantiate it with the Nextcloud configuration dictionary. Depending on the number of files in the cloud storage, it might take some time to build the interactive file table.

If you already know which files you are looking for, you can pass a regular expression as the optional search_regex= argument. You can adjust this search string in the interactive table afterwards to obtain a different filter.

from fairdatanow import RemoteData
import os
configuration = {
    'url': "https://laboppad.nl/falnama-project", 
    'user':    os.getenv('NC_AUTH_USER'),
    'password': os.getenv('NC_AUTH_PASS')
}
remote_data = RemoteData(configuration)

We can now have a look at the contents of the cloud folder using the RemoteData.listdir() method. This creates an interactive table with all project files. If we already have a better idea of what we are looking for, we can shorten the table by providing a regular expression search string search_regex= as an optional argument. As an example, let’s walk through the process of locating a set of x-ray TIFF files that the Rijksmuseum created for us.

remote_data.listdir(search_regex='xray')
Please wait while scanning all file paths in remote folder...
Ready building file table for 'falnama-project'
Total number of files and directories: 6342
Total size of the files: 194.8 GiB

If we scroll through this first selection of 209 entries, we see that the interactive table contains all kinds of files related to the x-ray images. Using the Custom Search Builder and/or adjusting the regular expression in the search bar, we can interactively narrow down the filter to show only the specific files we currently need. You can try this yourself.

It turns out that with search_regex='edited.tif' we obtain all 28 tif files that we need for further processing. They contain top halves and bottom halves for 14 pages that were imaged using x-rays.
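A point worth keeping in mind with a pattern such as 'edited.tif' is that an unescaped dot matches any character. The sketch below illustrates this with Python's re module on a few made-up paths. It assumes the table filter follows ordinary regular expression semantics, which this page does not confirm; filter_paths and the file names are hypothetical.

```python
import re

# Hypothetical file paths, for illustration only.
paths = [
    "xray/page-01_top_edited.tif",
    "xray/page-01_bottom_edited.tif",
    "xray/page-01_top_edited_tif_notes.txt",
    "xray/page-01_top_raw.tif",
]


def filter_paths(paths, pattern):
    """Keep the paths in which the regular expression matches anywhere."""
    regex = re.compile(pattern)
    return [p for p in paths if regex.search(p)]


# The unescaped '.' matches any character, so 'edited_tif' is matched too.
loose = filter_paths(paths, "edited.tif")
# Escaping the dot and anchoring at the end keeps only real .tif files.
strict = filter_paths(paths, r"edited\.tif$")
```

If a loose pattern returns a few unexpected rows, escaping the dot and anchoring the pattern is usually enough to tighten it.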

We can now select rows in this interactive table by Shift-clicking and Ctrl-clicking. Selected rows are colored blue. The selected files can then be downloaded to a local cache folder on your machine with the .download_selected() method. Downloading is skipped for selected files that are already present locally. The local file paths in the cache folder are returned in the files list for further processing.

files = remote_data.download_selected()
Ready with downloading 28 selected remote files to local cache: /home/frank/.cache/fairdatanow
/edited pictures/71803-8_bottom_Falnama_grenz_2-2_edited.tif
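Before processing, it can help to organize the returned paths per page. The sketch below groups file names into top and bottom halves. It assumes that names follow the pattern seen in the output above (page id, then the half, separated by underscores) and that a matching "_top_" file exists for each "_bottom_" file; both the paths and the pair_halves helper are hypothetical.

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical local paths; only the "_bottom_" naming is taken from the
# output above, the "_top_" counterpart and the folder are assumptions.
files = [
    "/tmp/cache/71803-8_bottom_Falnama_grenz_2-2_edited.tif",
    "/tmp/cache/71803-8_top_Falnama_grenz_2-2_edited.tif",
    "/tmp/cache/71803-9_top_Falnama_grenz_2-2_edited.tif",
]


def pair_halves(files):
    """Group paths by page id: {page_id: {'top': path, 'bottom': path}}."""
    pages = defaultdict(dict)
    for f in files:
        parts = Path(f).name.split("_", 2)
        # Skip names that do not look like '<page>_<half>_<rest>'.
        if len(parts) < 3 or parts[1] not in ("top", "bottom"):
            continue
        page_id, half = parts[0], parts[1]
        pages[page_id][half] = f
    return dict(pages)


pairs = pair_halves(files)
# Pages for which one of the two halves is missing.
incomplete = [page for page, halves in pairs.items() if len(halves) < 2]
```

Listing incomplete pages first makes it easy to spot a file that was not selected in the table before any image processing starts.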

Ok, we can now start working with this data to see if we can stitch these halves together. This is the topic of the next section.

Using the custom search builder

In some cases we need a more powerful filter to select precisely the files that we need. Here is an example of such a predefined search query. See the DataTables SearchBuilder documentation for details on the syntax.

searchBuilder_rma_zips = {
    "preDefined": {
        "criteria": [
            {"data": "path", "condition": "contains", "value": ["RMA"]}, 
            {"data": "ext", "condition": "=", "value": [".zip"]}
        ]
    }
}
remote_data.listdir(search_regex='xray', searchBuilder=searchBuilder_rma_zips)
Please wait while scanning all file paths in remote folder...
Ready building file table for 'falnama-project'
Total number of files and directories: 6342
Total size of the files: 194.8 GiB
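Writing the nested dictionary by hand becomes verbose when the criteria change often. As a sketch, the hypothetical helper below (not part of fairdatanow or ITables) builds the same structure as the example above from simple tuples.

```python
def predefined_search(*criteria):
    """Build a SearchBuilder-style dict from (column, condition, value) tuples.

    Hypothetical convenience helper; it only assembles a plain dictionary.
    """
    return {
        "preDefined": {
            "criteria": [
                {"data": column, "condition": condition, "value": [value]}
                for column, condition, value in criteria
            ]
        }
    }


# Same query as above: paths containing "RMA" with the extension ".zip".
searchBuilder_rma_zips = predefined_search(
    ("path", "contains", "RMA"),
    ("ext", "=", ".zip"),
)
```

The resulting dictionary can be passed as the searchBuilder= argument in the same way as the handwritten one.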

FUNCTIONS


RemoteData

 RemoteData (configuration)

Recursively scan the contents of a remote webdav server as specified by configuration.


2025 Rijkserfgoedlaboratorium Amsterdam | Made with nbdev
