This project implements a deep learning model to translate satellite imagery into corresponding map-like representations. Leveraging the power of Conditional Generative Adversarial Networks (cGANs), the model learns to synthesize visual map styles based on the underlying geographical features present in satellite photographs.
The primary goal is to generate stylized maps that highlight specific features (e.g., roads, buildings) derived from raw satellite data, offering a different visual perspective compared to standard satellite views or traditional handcrafted maps.
The core of this model is a Conditional Generative Adversarial Network (cGAN).
- Generator: The generator network takes a satellite image as input and attempts to produce a synthetic map image that corresponds to the input. Its architecture is typically based on an encoder-decoder structure (such as a U-Net) to capture and reconstruct spatial information effectively (a minimal architecture sketch appears after this list).
- Discriminator: The discriminator network receives both the input satellite image and either a real map or the generated map. Its task is to distinguish between pairs of (satellite image, real map) and (satellite image, generated map).
- Conditioning: The "conditional" aspect means both the generator and discriminator are conditioned on the input satellite image, allowing the model to learn a mapping between the input satellite view and the output map view.
- Adversarial Training: The generator and discriminator are trained together in an adversarial manner. The generator tries to fool the discriminator by creating realistic-looking maps, while the discriminator gets better at detecting fake maps. This competition drives the generator to produce increasingly convincing output.
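The sketch below illustrates what a pix2pix-style U-Net generator and conditioned PatchGAN discriminator might look like in TensorFlow/Keras. The layer counts, filter sizes, and the 256x256 input resolution are illustrative assumptions based on the referenced TensorFlow tutorial, not the exact configuration used in this repository.

```python
import tensorflow as tf

def downsample(filters, size, apply_batchnorm=True):
    # Conv block that halves the spatial resolution (encoder side).
    block = tf.keras.Sequential()
    block.add(tf.keras.layers.Conv2D(filters, size, strides=2, padding='same', use_bias=False))
    if apply_batchnorm:
        block.add(tf.keras.layers.BatchNormalization())
    block.add(tf.keras.layers.LeakyReLU())
    return block

def upsample(filters, size):
    # Transposed-conv block that doubles the spatial resolution (decoder side).
    block = tf.keras.Sequential()
    block.add(tf.keras.layers.Conv2DTranspose(filters, size, strides=2, padding='same', use_bias=False))
    block.add(tf.keras.layers.BatchNormalization())
    block.add(tf.keras.layers.ReLU())
    return block

def build_generator():
    # U-Net: encoder features are fed back to the decoder via skip connections.
    inputs = tf.keras.layers.Input(shape=[256, 256, 3])
    down_stack = [downsample(64, 4, apply_batchnorm=False),
                  downsample(128, 4),
                  downsample(256, 4),
                  downsample(512, 4)]
    up_stack = [upsample(256, 4),
                upsample(128, 4),
                upsample(64, 4)]
    x = inputs
    skips = []
    for down in down_stack:
        x = down(x)
        skips.append(x)
    skips = reversed(skips[:-1])
    for up, skip in zip(up_stack, skips):
        x = up(x)
        x = tf.keras.layers.Concatenate()([x, skip])
    # Final layer maps back to 3 channels; tanh keeps outputs in [-1, 1].
    outputs = tf.keras.layers.Conv2DTranspose(3, 4, strides=2, padding='same', activation='tanh')(x)
    return tf.keras.Model(inputs=inputs, outputs=outputs)

def build_discriminator():
    # PatchGAN-style discriminator conditioned on the satellite image:
    # the conditioning image and the (real or generated) map are concatenated.
    inp = tf.keras.layers.Input(shape=[256, 256, 3], name='satellite_image')
    tar = tf.keras.layers.Input(shape=[256, 256, 3], name='map_image')
    x = tf.keras.layers.Concatenate()([inp, tar])
    x = downsample(64, 4, apply_batchnorm=False)(x)
    x = downsample(128, 4)(x)
    x = downsample(256, 4)(x)
    # Output is a grid of logits, each judging one patch of the image pair.
    outputs = tf.keras.layers.Conv2D(1, 4, padding='same')(x)
    return tf.keras.Model(inputs=[inp, tar], outputs=outputs)
```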
- Translates satellite images to map-like images.
- Utilizes a cGAN architecture for high-quality image-to-image translation.
- (Add specific features here, e.g., Handles different geographical terrains, Outputs specific map styles, Supports various input image sizes - if applicable)
During training, the cGAN is presented with pairs of aligned satellite images and their corresponding ground truth maps. The generator attempts to create a map from the satellite image, and the discriminator evaluates if this generated map looks like a real map given the original satellite image. Through backpropagation and the adversarial game, the generator learns to produce maps that are visually plausible and geographically accurate representations of the input satellite data.
For inference (using the trained model), a new satellite image is fed into the generator, which then outputs the translated map image.
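As a rough sketch, inference might look like the following. The file path, 256x256 resolution, and [-1, 1] normalization are assumptions; `training=True` follows the TensorFlow pix2pix tutorial so that dropout and batch norm behave as they did during training.

```python
import tensorflow as tf

def translate(generator, image_path):
    # Load and decode the satellite image, then resize to the (assumed) training resolution.
    image = tf.io.decode_jpeg(tf.io.read_file(image_path))
    image = tf.image.resize(tf.cast(image, tf.float32), [256, 256])
    # Normalize to [-1, 1], matching the generator's tanh output range.
    image = (image / 127.5) - 1.0
    # Add a batch dimension and run the generator.
    generated_map = generator(image[tf.newaxis, ...], training=True)
    # Rescale from [-1, 1] back to [0, 1] for display or saving.
    return (generated_map[0] + 1.0) / 2.0
```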
This model was built and trained using TensorFlow 2.x.
This model requires a dataset consisting of paired images: a satellite image and its corresponding map representation for the same location.
The dataset used is: http://efrosgans.eecs.berkeley.edu/pix2pix/datasets/maps.tar.gz
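A minimal sketch for downloading the archive and splitting each paired image into its two halves is shown below. It assumes the standard pix2pix maps layout, where each JPEG stores the satellite view and the map view side by side; which half is which, and the exact dimensions, should be verified against the extracted files.

```python
import tensorflow as tf

# Download and extract the paired maps dataset.
path = tf.keras.utils.get_file(
    'maps.tar.gz',
    origin='http://efrosgans.eecs.berkeley.edu/pix2pix/datasets/maps.tar.gz',
    extract=True)

def load_pair(image_file):
    # Each file is assumed to hold the satellite image and its map concatenated
    # horizontally, so splitting at the midpoint yields the (input, target) pair.
    image = tf.io.decode_jpeg(tf.io.read_file(image_file))
    width = tf.shape(image)[1] // 2
    satellite = tf.cast(image[:, :width, :], tf.float32)
    map_image = tf.cast(image[:, width:, :], tf.float32)
    return satellite, map_image
```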
- Prepare your dataset according to the 'Dataset' section.
- We create a custom training loop built around a `train_step()` function (a sketch follows this list).
- Each training step opens a `with` statement that creates a `tf.GradientTape()`, so the forward pass is recorded for gradient computation.
- The generator produces a map from the satellite image, and the discriminator is called on both the (satellite image, real map) pair and the (satellite image, generated map) pair.
- We compute the generator and discriminator losses (the loss function used is Binary Crossentropy) and calculate the respective gradients.
- The gradients are applied through the optimizer (the chosen optimizer is Adam).
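The following is a minimal sketch of such a `train_step()`, assuming Binary Crossentropy computed from logits, two gradient tapes, and separate Adam optimizers for the generator and discriminator. The learning rate and `beta_1` values are assumptions, and the full pix2pix recipe also adds an L1 term, which is omitted here for brevity.

```python
import tensorflow as tf

loss_fn = tf.keras.losses.BinaryCrossentropy(from_logits=True)
generator_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
discriminator_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)

@tf.function
def train_step(generator, discriminator, satellite_image, real_map):
    # Record the forward pass for both networks on separate tapes.
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_map = generator(satellite_image, training=True)

        # The discriminator sees the conditioning satellite image together
        # with either the real map or the generated map.
        real_output = discriminator([satellite_image, real_map], training=True)
        fake_output = discriminator([satellite_image, generated_map], training=True)

        # Generator loss: fool the discriminator into labelling fakes as real.
        gen_loss = loss_fn(tf.ones_like(fake_output), fake_output)

        # Discriminator loss: real pairs -> 1, generated pairs -> 0.
        disc_loss = (loss_fn(tf.ones_like(real_output), real_output) +
                     loss_fn(tf.zeros_like(fake_output), fake_output))

    # Compute and apply gradients with the Adam optimizers.
    gen_grads = gen_tape.gradient(gen_loss, generator.trainable_variables)
    disc_grads = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    generator_optimizer.apply_gradients(zip(gen_grads, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(disc_grads, discriminator.trainable_variables))
    return gen_loss, disc_loss
```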
Example 1: Here is an example output produced by this model: https://github.com/Anish-CodeDev/Maps_pix2pix/blob/main/output/output.jpg
(The real and generated images are shown together.)
- Inspired by the Pix2Pix cGAN paper: https://arxiv.org/pdf/1611.07004
- I referred to https://www.tensorflow.org/tutorials/generative/pix2pix for the implementation of this model.
- Uses libraries: TensorFlow
- Dataset source: http://efrosgans.eecs.berkeley.edu/pix2pix/datasets/maps.tar.gz
If you have any questions, feel free to open an issue on this repository or contact me at my email address: [email protected]