{"id":519,"date":"2025-05-10T13:37:53","date_gmt":"2025-05-10T10:37:53","guid":{"rendered":"https:\/\/panagiotis-filippakis.pro\/?p=519"},"modified":"2025-06-23T22:08:02","modified_gmt":"2025-06-23T19:08:02","slug":"condensing-multi-label-data-based-on-clustering","status":"publish","type":"post","link":"https:\/\/panagiotis-filippakis.pro\/index.php\/2025\/05\/10\/condensing-multi-label-data-based-on-clustering\/","title":{"rendered":"Condensing multi-label data based on Clustering"},"content":{"rendered":"<div id=\"pl-519\"  class=\"panel-layout\" ><div id=\"pg-519-0\"  class=\"panel-grid panel-no-style\" ><div id=\"pgc-519-0-0\"  class=\"panel-grid-cell\" ><div id=\"panel-519-0-0-0\" class=\"so-panel widget widget_sow-editor panel-first-child panel-last-child\" data-index=\"0\" ><div\n\t\t\t\n\t\t\tclass=\"so-widget-sow-editor so-widget-sow-editor-base\"\n\t\t\t\n\t\t><h3 class=\"widget-title\">Condensing multi-label data based on Clustering<\/h3>\n<div class=\"siteorigin-widget-tinymce textwidget\">\n\t<p>A common approach to speed up instance-based classifiers, while preserving accuracy, is to use a subset of the original training set. This process, known as condensing, reduces the training dataset to a smaller, representative set. This is accomplished by employing a data reduction technique that selects the most relevant training instances. While many of these techniques have been successfully applied to single-label classification, most are not appropriate for multi-label data, where instances can be associated with multiple classes. This paper proposes a simple data reduction technique for multi-label datasets utilizing \ud835\udc3e-means++ clustering. The proposed method selects instances near cluster centroids to form the condensing set, utilizing \ud835\udc3e-means++\u2019s initialization strategy to achieve widely spread initial centroids. 
Experiments conducted on nine datasets, combined with a statistical test, show that our approach achieves significant data reduction while maintaining high classification accuracy.<\/p>\n<p>Link: <a href=\"https:\/\/doi.org\/10.1145\/3716554.3716591\" target=\"_blank\" rel=\"noopener\">https:\/\/doi.org\/10.1145\/3716554.3716591<\/a><\/p>\n<\/div>\n<\/div><\/div><\/div><\/div><\/div>","protected":false},"excerpt":{"rendered":"<p>A common approach to speed up instance-based classifiers, while preserving accuracy, is to use a subset of the original training set. This process, known as condensing, reduces the training dataset to a smaller, representative set. This is accomplished by employing a data reduction technique that selects the most relevant training instances. While many of&hellip;&nbsp;<a href=\"https:\/\/panagiotis-filippakis.pro\/index.php\/2025\/05\/10\/condensing-multi-label-data-based-on-clustering\/\" rel=\"bookmark\">Read more &raquo;<span class=\"screen-reader-text\">Condensing multi-label data based on 
Clustering<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":535,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"neve_meta_sidebar":"","neve_meta_container":"","neve_meta_enable_content_width":"off","neve_meta_content_width":70,"neve_meta_title_alignment":"","neve_meta_author_avatar":"","neve_post_elements_order":"","neve_meta_disable_header":"","neve_meta_disable_footer":"","neve_meta_disable_title":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[8],"tags":[],"class_list":["post-519","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-publications"],"jetpack_featured_media_url":"https:\/\/panagiotis-filippakis.pro\/wp-content\/uploads\/2025\/05\/PCI24.jpg","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/panagiotis-filippakis.pro\/index.php\/wp-json\/wp\/v2\/posts\/519","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/panagiotis-filippakis.pro\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/panagiotis-filippakis.pro\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/panagiotis-filippakis.pro\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/panagiotis-filippakis.pro\/index.php\/wp-json\/wp\/v2\/comments?post=519"}],"version-history":[{"count":4,"href":"https:\/\/panagiotis-filippakis.pro\/index.php\/wp-json\/wp\/v2\/posts\/519\/revisions"}],"predecessor-version":[{"id":566,"href":"https:\/\/panagiotis-filippakis.pro\/index.php\/wp-json\/wp\/v2\/posts\/519\/revisions\/566"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/panagiotis-filippakis.pro\/index.php\/wp-json\/wp\/v2\/media\/535"}],"wp:attachment":[{"href":"https:\/\/panagiotis-filippakis.pro\/index.php\/wp-json\/wp\/v2\/media?parent=519"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/panagiot
is-filippakis.pro\/index.php\/wp-json\/wp\/v2\/categories?post=519"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/panagiotis-filippakis.pro\/index.php\/wp-json\/wp\/v2\/tags?post=519"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}