Publications ›› Papers

Ethics and Fair Use Framework for Privacy Preserving Data Sharing

Authors: Bita Afsharinia, Anjula Gurtoo, Jyotirmoy Dutta, Minnu Malieckal
The study aims to critically evaluate current privacy-preserving technologies and ethical frameworks in data sharing, identifying gaps and proposing a comprehensive, integrated ethical framework. Existing frameworks often fall short in integrating these two aspects effectively, particularly in the context of emerging technologies. The study […]

Privacy-Preserving Data Sharing: A Comprehensive Literature Review

Authors: Bita Afsharinia, Anjula Gurtoo, Jyotirmoy Dutta, Minnu Malieckal
This report provides an overview of the key concepts and techniques in privacy-preserving data analysis, emphasizing the importance of differential privacy, data anonymization, and data perturbation. The privacy-preserving principles of autonomy, justice, non-maleficence, beneficence, and explicability are critical for maintaining ethical standards in data […]

Data Governance: Some Pertinent Points for Discussion

Authors: Anjula Gurtoo, Jyotirmoy Dutta
The data governance panel during the 24 November 2023 Symposium on Data for Public Good at the Indian Institute of Science debated pertinent issues in the quest for a pragmatic policy framework for data governance. This document details and collates the views of different stakeholders.

Playbook for Data Quality Management for Geospatial Data

Authors: Minnu Malieckal, Anjula Gurtoo, Jyotirmoy Dutta, Linda Theres B, Sandeep P
This document is intended as a playbook for the geospatial sector to better understand what is meant by data, its use, and how to manage its quality. The playbook details domains and datasets in the geospatial sector, types of data, data formats, […]

Optimal Tree-Based Mechanisms for Differentially Private Approximate CDFs

Authors: V. A. Rameshwar, A. Tandon, and A. Sharma
This paper considers the epsilon-differentially private (DP) release of an approximate cumulative distribution function (CDF) of the samples in a dataset. We assume that the true (approximate) CDF is obtained after lumping the data samples into a fixed number K of bins. In this […]

Enhancing MOTION2NX for Efficient, Scalable and Secure Image Inference using Convolutional Neural Networks

Authors: H. Kallamadi, R. Burra, S. Mittal, S. Sharma, A. Venkatesh, A. Tandon
This work contributes towards the development of an efficient and scalable open-source Secure Multi-Party Computation (SMPC) protocol on machines with moderate computational resources. We use the ABY2.0 SMPC protocol implemented on the C++-based MOTION2NX framework for secure convolutional neural […]

Performance Evaluation of Geospatial Images based on Zarr and Tiff

Authors: Jaheer khan, Swarup E, Rakshit Ramesh
This study evaluates the performance of geospatial image processing using two distinct data storage formats: Zarr and TIFF. Geospatial images are used in numerous applications such as environmental monitoring, urban planning, and disaster management. The traditional Tagged Image File Format (TIFF) is widely used because it is simple and compatible, but […]

Cross Pseudo Supervision Framework for Sparsely Labelled Geospatial Images

Authors: Yash Dixit, Naman Srivastava, Joel D Joy, Rohan Olikara, Swarup E, Rakshit Ramesh
Land Use Land Cover (LULC) mapping is a vital tool for urban and resource planning, playing a key role in the development of innovative and sustainable cities. This study introduces a semi-supervised segmentation model for LULC prediction using high-resolution […]

An Atmospheric Correction Integrated LULC Segmentation Model for High-Resolution Satellite Imagery

Authors: Soham Mukherjee, Yash Dixit, Naman Srivastava, Joel D Joy, Koesha Sinha, Rohan Olikara, Swarup E, Rakshit Ramesh
The integration of fine-scale multispectral imagery with deep learning models has revolutionized land use and land cover (LULC) classification. However, the atmospheric effects present in Top-of-Atmosphere sensor-measured Digital Number values must be corrected to […]

Enhanced ETA Predictions with T-GCN on Optimized Road Segments

Authors: Shivika Sharma, Nandini Mawane, Chetan Kumar Kuraganti, Dhruthick Gowda M, Mayur Taware, Yash Chandrashekhar Dixit, Sahil Mishra, Raghu Krishnapuram and Rakshit Ramesh
Accurate Estimated Time of Arrival (ETA) predictions play a critical role in improving the operational efficiency, traffic management, and reliability of services in transit systems. This work presents a comprehensive […]

Data-Driven Solid Waste Management Optimization with Vehicle Routing and Redistribution Strategy

Authors: Chetan Kumar Kuraganti, Arun Josephraj, Saanidhya Vats, Amarthya KR, Koesha Sinha, Raghu Krishnapuram and Rakshit Ramesh
This paper presents two techniques for optimizing a city's solid waste management (SWM) systems. The first technique involves planning routes for garbage collection vehicles, which utilizes the Capacitated Vehicle Routing Problem (CVRP) and Google OR-Tools to […]

Comparison of Segmentation Methods in Remote Sensing for Land Use Land Cover

Authors: Naman Srivastava, Joel D Joy, Yash Dixit, Swarup E, Rakshit Ramesh
Land Use Land Cover (LULC) mapping is essential for urban and resource planning, and is one of the key elements in developing smart and sustainable cities. This study evaluates advanced LULC mapping techniques, focusing on Look-Up Table (LUT)-based Atmospheric Correction applied to Cartosat […]

Automatable Data Quality Dimensions for Data Exchange: Formulation and Application

Authors: Debarun Sengupta, Anjula Gurtoo, Minnu Malieckal, Jyotirmoy Dutta
Large amounts of data are generated and applied in decision-making to improve outcomes. However, data quality remains an issue, as data is generated from varied sources and in unspecified formats, and variables vary across different types of data. Identifying key quality dimensions […]

SKALD: Scalable K-Anonymisation for Large Datasets

Authors: K. Reddy, N. Chakraborty, A. Dharmavaram, A. Tandon
Data privacy and anonymisation are critical concerns in today’s data-driven society, particularly when handling personal and sensitive user data. Regulatory frameworks worldwide recommend privacy-preserving protocols such as k-anonymisation to de-identify releases of tabular data. Available hardware resources provide an upper bound on the maximum size […]

Improving the Privacy Loss Under User-Level DP Composition for Fixed Estimation Error

Authors: V. A. Rameshwar and A. Tandon
This paper considers the private release of statistics of several disjoint subsets of a dataset. In particular, we consider the epsilon-user-level differentially private release of sample means and variances of sample values in disjoint subsets of a dataset, in a potentially sequential manner. Traditional analysis of the privacy […]

(ℓ, 𝛿)-Diversity: Linkage-Robustness via a Composition Theorem

Authors: V. A. Rameshwar and A. Tandon
In this paper, we consider the problem of degradation of anonymity upon linkages of anonymized datasets. We work in the setting where an adversary links together t ≥ 2 anonymized datasets in which a user of interest participates, based on the user’s known quasi-identifiers, which motivates the use of ℓ-diversity as […]

Bounding User Contributions for User-Level Differentially Private Mean Estimation

Authors: V. Arvind Rameshwar (IIT Madras) and Anshoo Tandon
We revisit the problem of releasing the sample mean of bounded samples in a dataset, privately, under user-level ε-differential privacy (DP). We aim to derive the optimal method of preprocessing data samples, within a canonical class of processing strategies, in terms of the estimation error. […]

On the Optimal Number of Grids for Differentially Private Non-Interactive K-Means Clustering

Authors: Gokularam M, Anshoo Tandon
Differentially private K-means clustering enables releasing cluster centers derived from a dataset while protecting the privacy of the individuals. Non-interactive clustering techniques based on privatized histograms are attractive because the released data synopsis can be reused for other downstream tasks without additional privacy loss. The choice of the […]

Breaking Data Silos: How GDI is Transforming Access to Geospatial Information in India

Authors: Bryan Paul Robert, Mahidhar Chellamani, Jyotirmoy Dutta
For years, some of India’s most valuable geospatial datasets remained scattered across government departments, research institutes, or private organizations. They held immense potential to transform logistics, strengthen climate resilience, and support smarter urban planning, but they remained difficult to access, buried in different formats and […]

Adaptive Self-Distillation for Minimizing Client Drift in Heterogeneous Federated Learning

Authors: M. Yashwanth, G. K. Nayak, A. Singh, Y. Simmhan, A. Chakraborty
Federated Learning (FL) is a machine learning paradigm that enables clients to jointly train a global model by aggregating the locally trained models without sharing any local training data. In practice, there can often be substantial heterogeneity (e.g., class imbalance) across […]

Privacy-Preserving Data Quality Assessment for Time-Series IoT Sensors

Authors: N. Chakraborty, A. Sharma, J. Dutta, H. D. Kumar
This paper proposes a novel framework for automated, objective, and privacy-preserving data quality assessment of time-series data from IoT sensors deployed in smart cities. We leverage custom, autonomously computable metrics that parameterise the temporal performance and adherence to a declarative schema document to […]
