Reading a dataset
Reading a dataset with Petastorm is straightforward. This section shows how to load a Petastorm dataset into two widely used deep learning frameworks, TensorFlow and PyTorch:
- To load our Petastorm datasets, we use the petastorm.reader.Reader class, which implements the iterator interface, so we can iterate over the samples efficiently in plain Python. A petastorm.reader.Reader instance can be created using the petastorm.make_reader factory method:

      import matplotlib.pyplot as plt
      from petastorm import make_reader

      # The with block stops the reader once iteration is done
      with make_reader('dfs://some_dataset') as reader:
          for sample in reader:
              # Each sample exposes the dataset fields as attributes
              print(sample.id)
              plt.imshow(sample.image1)

- The following code example shows how we can stream a dataset into the TensorFlow Examples class, which as we have seen before is a named tuple with the keys being the ones specified in the Unischema of...
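
To make the access pattern above more concrete, here is a minimal sketch in plain Python of a reader that is both a context manager and an iterator and that yields named tuples. It is only an illustration under stated assumptions: ToyReader, Sample, and the in-memory rows are hypothetical stand-ins, not Petastorm's implementation, and the field names id and image1 are simply borrowed from the example above.

```python
from collections import namedtuple

# Hypothetical sample type. In Petastorm the field names would come from the
# dataset's Unischema; here they are hard-coded for illustration only.
Sample = namedtuple("Sample", ["id", "image1"])


class ToyReader:
    """Illustrative stand-in for a reader; not Petastorm's Reader class."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._position = 0
        self.closed = False

    def __enter__(self):
        # Returning self lets callers write: with ToyReader(...) as reader
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Mark the reader as closed; do not suppress exceptions
        self.closed = True
        return False

    def __iter__(self):
        return self

    def __next__(self):
        if self._position >= len(self._rows):
            raise StopIteration
        row = self._rows[self._position]
        self._position += 1
        # Wrap each row in a named tuple so fields are read as attributes
        return Sample(**row)


# Made-up rows standing in for a stored dataset
rows = [
    {"id": 0, "image1": [[0, 1], [1, 0]]},
    {"id": 1, "image1": [[1, 1], [0, 0]]},
]

with ToyReader(rows) as reader:
    ids = [sample.id for sample in reader]

print(ids)
```

The named tuple is what allows attribute-style access such as sample.id, and the context manager gives the reader a well-defined point at which to release its resources.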