Exploring your remote data in a breeze

Instantly find and access all your cloud data

In recent months our colleagues from the Falnama project have collected more than 6000 data files that we need to explore, process and visualize. Within the project folder structure of our Nextcloud storage we have now created a special folder called ‘OPENDATA’ that you have permission to read. Let’s see what is inside!

With our Python package fairdatanow we can programmatically explore the contents of this cloud storage folder. Under the hood, fairdatanow uses the Nextcloud Python Framework nc-py-api to communicate with the cloud server, and the Python packages polars, itables and anywidget to create interactive tables.

To get started we need the configuration details for communicating with our Nextcloud server. We have created a read-only account for the ‘OPENDATA’ folder that you can try out yourself:

configuration = {
    'url': "https://laboppad.nl/falnama-project",
    'user':    "FALNAMA-OPENDATA",
    'password': "FALNAMA-WELCOME"
}
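Before handing such a dictionary to the package, it can help to check that it has the expected shape. The snippet below is a minimal sketch of such a check; `check_configuration` and `REQUIRED_KEYS` are hypothetical names introduced here for illustration and are not part of fairdatanow. The three keys are simply the ones used in the dictionary above.

```python
from urllib.parse import urlparse

# The three keys used in the configuration dictionary above.
REQUIRED_KEYS = ("url", "user", "password")


def check_configuration(configuration: dict) -> list[str]:
    """Return a list of problems; an empty list means the dictionary looks usable."""
    # Report every key that is absent or empty.
    problems = [
        f"missing key: {key!r}"
        for key in REQUIRED_KEYS
        if not configuration.get(key)
    ]
    # A URL without a scheme is a common copy-and-paste mistake.
    url = configuration.get("url", "")
    if url and urlparse(url).scheme not in ("http", "https"):
        problems.append(f"url should start with http:// or https://, got {url!r}")
    return problems


configuration = {
    'url': "https://laboppad.nl/falnama-project",
    'user': "FALNAMA-OPENDATA",
    'password': "FALNAMA-WELCOME",
}

print(check_configuration(configuration))  # → []
```

The check only looks at the shape of the dictionary. It does not contact the server, so a wrong password will still only show up when the connection is made.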

To get started you need to import the RemoteData class from the package.

from fairdatanow import RemoteData

Then instantiate it with the configuration dictionary. Depending on the number of files in the cloud storage, it might take some time to build the interactive file table.

remote_data = RemoteData(configuration)
Please wait while scanning all file paths in remote folder...
Ready building file table for 'falnama-project', Total number of files and directories: 375   

We can now have a look at the contents of the cloud folder using the RemoteData.itable attribute. This will create an interactive table.

remote_data.itable

We can now filter the interactive table and select rows by Shift- and Ctrl-clicking them. Selected rows are colored blue. The selected files can then be downloaded with the .download_selected() method into a cache folder on your local machine. Files that are already present locally are not downloaded again.

remote_data.download_selected()
Ready with downloading 1 selected remote files to local cache: /home/frank/.cache/falnama-project/OPENDATA/maxrf/datastacks/WM-71803-01_400_600_50.datastack                                                                      
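The idea of skipping files that are already in the local cache can be pictured roughly as follows. This is only a sketch of the general technique, not the package's actual implementation: the helper names `local_cache_path` and `needs_download` are hypothetical, the file name is made up, and the cache layout (remote path mirrored below a cache root) is an assumption based on the path printed above.

```python
import tempfile
from pathlib import Path, PurePosixPath


def local_cache_path(remote_path: str, cache_root: Path) -> Path:
    """Map a remote path onto a file path below the local cache folder."""
    # Drop any leading slash so the result always stays inside cache_root.
    parts = PurePosixPath(remote_path.lstrip("/")).parts
    return cache_root.joinpath(*parts)


def needs_download(remote_path: str, cache_root: Path) -> bool:
    """A file needs downloading only if it is not yet in the cache."""
    return not local_cache_path(remote_path, cache_root).is_file()


# Try it out in a temporary folder instead of the real cache.
with tempfile.TemporaryDirectory() as tmp:
    cache_root = Path(tmp)
    remote = "falnama-project/OPENDATA/example.datastack"  # made-up file name

    first = needs_download(remote, cache_root)  # nothing cached yet

    # Pretend the download happened by creating an empty file.
    target = local_cache_path(remote, cache_root)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(b"")

    second = needs_download(remote, cache_root)  # now present locally

print(first, second)  # → True False
```

A check based on file presence alone is cheap, but it cannot tell whether the remote file has changed since it was cached. Comparing sizes or modification times would be one way to refine it.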

Now I am curious to take a look at the data! If you are curious too, read the next section…

FUNCTIONS


RemoteData

 RemoteData (configuration)

Recursively scan the contents of a remote WebDAV server as specified by the configuration dictionary.
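To illustrate what a recursive scan means here, the sketch below walks a folder tree depth first and collects every path it finds. It is not how RemoteData is implemented. The `scan` function and the `list_dir` callable are hypothetical, and a small in-memory dictionary with made-up names stands in for the remote server so that the example runs without a network connection.

```python
from typing import Callable, Iterator

# list_dir is a hypothetical callable: given a directory path, it returns
# (name, is_dir) pairs for the direct children of that directory.
ListDir = Callable[[str], list[tuple[str, bool]]]


def scan(path: str, list_dir: ListDir) -> Iterator[str]:
    """Yield every file and directory path below `path`, depth first."""
    for name, is_dir in list_dir(path):
        child = f"{path.rstrip('/')}/{name}"
        # Mark directories with a trailing slash, then descend into them.
        yield child + "/" if is_dir else child
        if is_dir:
            yield from scan(child, list_dir)


# In-memory stand-in for a remote server (made-up names, illustration only).
fake_tree = {
    "OPENDATA": [("maxrf", True), ("README.txt", False)],
    "OPENDATA/maxrf": [("scan-01.datastack", False)],
}


def fake_list_dir(path: str) -> list[tuple[str, bool]]:
    return fake_tree.get(path, [])


paths = list(scan("OPENDATA", fake_list_dir))
for p in paths:
    print(p)
```

With a real server, `list_dir` would be replaced by a call that lists one remote folder, and the collected paths could then be loaded into a table for filtering.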