Publications
An up-to-date list is available on Google Scholar.
Sun, Y., Shen, J., and Kwon, Y. (2024). 2D-OOB: Attributing Data Contribution through Joint Valuation Framework. Advances in Neural Information Processing Systems (NeurIPS 2024). [URL]. [GitHub].
Banks, D., Melo, G. D., Gong, S., Kwon, Y., and Rudin, C. (2024). Data Scientists Discuss AI Risks and Opportunities. Harvard Data Science Review. [URL].
Wang, J.T., Yang, T., Zou, J., Kwon, Y., and Jia, R. (2024). Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits. International Conference on Machine Learning (ICML 2024). (selected for oral presentation, Top 1.5%). [URL].
Kwon, Y.*, Wu, E.*, Wu, K.*, and Zou, J. (2024). DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models. International Conference on Learning Representations (ICLR 2024). [URL]. [GitHub].
Jiang, K.*, Liang, W.*, Zou, J., and Kwon, Y. (2023). OpenDataVal: a Unified Benchmark for Data Valuation. Advances in Neural Information Processing Systems (NeurIPS 2023) Track on Datasets and Benchmarks. [URL]. [Website]. [GitHub].
Kwon, Y. and Zou, J. (2023). Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value. International Conference on Machine Learning (ICML 2023). [URL]. [GitHub].
Liang, W.*, Mao, Y.*, Kwon, Y.*, Yang, X., and Zou, J. (2023). On the nonlinear correlation of ML performance between data subpopulations. International Conference on Machine Learning (ICML 2023). [URL]. [Website].
Kwon, Y., Ginart, T., and Zou, J. (2022). Competition over data: how does data purchase affect users? Transactions on Machine Learning Research (TMLR). [URL]. [GitHub].
Kwon, Y. and Zou, J. (2022). WeightedSHAP: analyzing and improving Shapley based feature attributions. Advances in Neural Information Processing Systems (NeurIPS 2022). [URL]. [GitHub].
Liang, W.*, Zhang, Y.*, Kwon, Y.*, Yeung, S., and Zou, J. (2022). Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning. Advances in Neural Information Processing Systems (NeurIPS 2022). [URL]. [Website]. [GitHub].
Parikh, V., …, Kwon, Y., …, and Ashley, E. (2022). Deconvoluting complex correlates of COVID-19 severity with a multi-omic pandemic tracking strategy. Nature Communications, 13(1), 1-10. [URL].
Kwon, Y. and Zou, J. (2022). Beta-Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning. Proceedings of the 25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022), PMLR 151:8780-8802. [URL]. [GitHub]. (selected for oral presentation, Top 2.6%).
Kwon, Y., Rivas, M.A., and Zou, J. (2021). Efficient computation and analysis of distributional Shapley values. Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021), PMLR 130:793-801. [URL]. [GitHub].
Ginart, A.A., Zhang, E., Kwon, Y., and Zou, J. (2021). Competing AI: How does competition feedback affect machine learning. Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021), PMLR 130:1693-1701. [URL]. [GitHub]. [Press].
Hwang, D., Yang, S., Kwon, Y., Lee, K., Lee, G., Jo, H., Yoon, S., and Ryu, S. (2020). A comprehensive study on molecular supervised learning with graph neural networks. Journal of Chemical Information and Modeling: 5936-5945. [URL]. [GitHub].
Kwon, Y., Kim, W., Won, J.-H., and Paik, M.C. (2020). Principled learning method for Wasserstein distributionally robust optimization with local perturbations. Proceedings of the 37th International Conference on Machine Learning (ICML 2020), PMLR 119:5567-5576. [URL]. [GitHub].
Kim, Y., Kwon, Y., Chang, H., and Paik, M.C. (2020). Lipschitz Continuous Autoencoders in Application to Anomaly Detection. Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020), PMLR 108:2507-2517. [URL]. [GitHub].
Kwon, Y., Kim, W., Sugiyama, M., and Paik, M.C. (2020). Principled analytic formulation for positive-unlabeled learning via weighted integral probability metric. Machine Learning, 109 (3): 513-532. [URL]. [GitHub].
Kwon, Y., Won, J.-H., Kim, B. J., and Paik, M.C. (2020). Uncertainty quantification using Bayesian neural networks in classification: Application to biomedical image segmentation. Computational Statistics & Data Analysis, 142: 106816. [URL]. [GitHub]. (conference version accepted as an oral presentation at the International Conference on Medical Imaging with Deep Learning, MIDL 2018).
Ryu, S., Kwon, Y., and Kim, W. (2019). Bayesian graph convolutional network for reliable prediction of molecular properties with uncertainty quantification. Chemical Science, 10 (36): 8438-8446. [URL]. [GitHub].
Kim, Y., Kwon, Y., and Paik, M.C. (2019). Valid oversampling schemes to handle imbalance. Pattern Recognition Letters, 125 (1): 661-667. [URL]. [GitHub].
Kwon, Y., Kim, J., Paik, M.C., and Kim, H. (2018). A robust calibration-assisted method for linear mixed effects model under cluster-specific nonignorable missingness. Statistica Sinica, 28 (4): 1907-1928. [URL].
Winzeck, S., …, Kwon, Y., et al. (2018). ISLES 2016 & 2017: Benchmarking Ischemic Stroke Lesion Outcome Prediction Based on Multispectral MRI. Frontiers in Neurology, 9, 679. [URL].
Choi, Y.*, Kwon, Y.*, Lee, H.*, Kim, B. J., Paik, M.C., and Won, J.-H. (2017). Ensemble of Deep Convolutional Neural Networks for Prognosis of Ischemic Stroke. Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. BrainLes 2016. Lecture Notes in Computer Science, 10154: 231-243. [URL].
Kwon, Y., Choi, Y.-G., Park, T., Ziegler, A., and Paik, M.C. (2017). Generalized estimating equations with stabilized working correlation structure. Computational Statistics & Data Analysis, 106: 1-11. [URL].
Kim, J., Kwon, Y., and Paik, M.C. (2016). Calibrated propensity score method for survey nonresponse in cluster sampling. Biometrika, 103 (2): 461-473. [URL].
* indicates the authors contributed equally.