vams2krish/Image-Classification

Dataset Class

The Dataset class represents a potentially large set of elements in TensorFlow's tf.data API. It provides methods for creating, transforming, and iterating over datasets.

Key Features

  • Supports creation of datasets from various sources (tensors, generators, files, etc.)
  • Provides transformations such as mapping, filtering, batching, and shuffling
  • Enables efficient input pipelines through prefetching and parallelism
  • Supports windowing, grouping, and other advanced operations
  • Allows saving/loading datasets

Important Methods

Creation Methods

  • from_tensor_slices(): Creates a dataset from a tensor or nested structure of tensors
  • from_generator(): Creates a dataset from a Python generator function
  • range(): Creates a dataset of a step-separated range of values
  • list_files(): Creates a dataset of filenames matching glob patterns
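
The creation methods above can be sketched as follows. This is a minimal illustration rather than code from this repository: the generator `count_up`, the temporary file names, and all variable names are hypothetical, and `output_signature` assumes TensorFlow 2.4 or later.

```python
import os
import tempfile

import tensorflow as tf

# from_tensor_slices: slices along the first axis, one element per row.
pairs = tf.data.Dataset.from_tensor_slices(([1, 2, 3], ["a", "b", "c"]))
pairs_list = [(int(x), s.decode()) for x, s in pairs.as_numpy_iterator()]


# from_generator: wraps a Python generator; the signature describes each element.
def count_up():  # hypothetical generator used only for illustration
    for i in range(3):
        yield i


gen_ds = tf.data.Dataset.from_generator(
    count_up, output_signature=tf.TensorSpec(shape=(), dtype=tf.int32)
)
gen_list = [int(x) for x in gen_ds.as_numpy_iterator()]

# range: start, stop (exclusive), step, like Python's built-in range.
range_ds = tf.data.Dataset.range(0, 10, 3)
range_list = [int(x) for x in range_ds.as_numpy_iterator()]

# list_files: yields file paths that match a glob pattern.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt", "c.log"):
        open(os.path.join(tmp, name), "w").close()
    files_ds = tf.data.Dataset.list_files(os.path.join(tmp, "*.txt"), shuffle=False)
    file_names = sorted(
        os.path.basename(p.decode()) for p in files_ds.as_numpy_iterator()
    )
```

String elements come back as `bytes` from `as_numpy_iterator()`, which is why the sketch calls `.decode()` on them.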

Transformation Methods

  • map(): Applies a function to each element
  • filter(): Filters elements based on a predicate
  • batch(): Combines consecutive elements into batches
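  • shuffle(): Randomly shuffles elements using a buffer of the given size; a buffer at least as large as the dataset is needed for a full uniform shuffle
  • repeat(): Repeats the dataset a given number of times, or indefinitely when no count is given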
  • take(): Takes a specified number of elements from the start
  • skip(): Skips a specified number of elements from the start
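
A short sketch of how these transformations chain together, assuming a toy dataset of the integers 0 to 9. The variable names are illustrative only.

```python
import tensorflow as tf

ds = tf.data.Dataset.range(10)

# filter then map: keep even numbers, then square them.
evens_squared = ds.filter(lambda x: x % 2 == 0).map(lambda x: x * x)
squares = [int(x) for x in evens_squared.as_numpy_iterator()]

# batch: group consecutive elements; the last batch may be smaller.
batches = [b.tolist() for b in evens_squared.batch(2).as_numpy_iterator()]

# skip and take: drop the first two elements, then keep the next three.
window = [int(x) for x in ds.skip(2).take(3).as_numpy_iterator()]

# repeat: iterate over the dataset twice in a row.
repeated = [int(x) for x in tf.data.Dataset.range(2).repeat(2).as_numpy_iterator()]

# shuffle: the order changes, but the set of elements does not.
shuffled = [int(x) for x in ds.shuffle(buffer_size=10, seed=0).as_numpy_iterator()]
```

Each method returns a new dataset and leaves the original unchanged, so `ds` can be reused for every step.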

Iteration

  • __iter__(): Allows iterating over the dataset elements
  • as_numpy_iterator(): Returns an iterator over NumPy arrays
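
The two iteration styles differ in what they yield. The sketch below assumes eager execution (the TensorFlow 2.x default): plain iteration produces `tf.Tensor` objects, while `as_numpy_iterator()` converts each element to NumPy.

```python
import tensorflow as tf

ds = tf.data.Dataset.from_tensor_slices([1, 2, 3])

# Plain iteration uses __iter__ and yields tf.Tensor objects.
tensors = list(ds)

# as_numpy_iterator converts each element to a NumPy value.
arrays = list(ds.as_numpy_iterator())

tensor_values = [int(t.numpy()) for t in tensors]
array_values = [int(a) for a in arrays]
```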

Other

  • cache(): Caches elements of the dataset
  • prefetch(): Prefetches elements to improve performance
  • apply(): Applies a custom transformation function
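  • save(): Saves the dataset to disk; it can be restored later with load()
  • cardinality(): Returns the number of elements in the dataset when it can be determined statically; otherwise returns a sentinel for unknown or infinite cardinality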
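
A minimal sketch of the remaining methods, assuming TensorFlow 2.4 or later for `tf.data.AUTOTUNE`. The helper `double_then_batch` is a hypothetical function written for this example.

```python
import tensorflow as tf

ds = tf.data.Dataset.range(5)

# cardinality: known for a plain range, unknown after filter, infinite after repeat().
known = int(ds.cardinality())
unknown = int(ds.filter(lambda x: x > 1).cardinality())
infinite = int(ds.repeat().cardinality())

# cache stores elements after the first pass; prefetch overlaps
# producing the next element with consuming the current one.
pipeline = (
    ds.map(lambda x: x + 1, num_parallel_calls=tf.data.AUTOTUNE)
    .cache()
    .prefetch(tf.data.AUTOTUNE)
)
pipeline_values = [int(x) for x in pipeline.as_numpy_iterator()]


# apply: pass the whole dataset through a reusable transformation function.
def double_then_batch(dataset):  # hypothetical helper for illustration
    return dataset.map(lambda x: x * 2).batch(2)


applied = [b.tolist() for b in ds.apply(double_then_batch).as_numpy_iterator()]
```

Placing `cache()` after an expensive `map()` avoids recomputing that step on later epochs, and `prefetch()` is usually the last step of a pipeline.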

Usage Example

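import tensorflow as tf

# Create a dataset with one element per entry of the list
dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3])

# Apply transformations: double each element, then group into batches of 2
dataset = dataset.map(lambda x: x * 2)
dataset = dataset.batch(2)

# Iterate over the batches; the last batch holds the single leftover element
for element in dataset:
    print(element)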

The Dataset class forms the foundation of efficient data loading and preprocessing pipelines in TensorFlow, enabling scalable machine learning workflows.
