These are the proceedings of the fourth biennial conference in the Intelligent Data Analysis series. The conference took place in Cascais, Portugal, 13-15 September 2001. The theme of this conference series is the use of computers in intelligent ways in data analysis, including the exploration of intelligent programs for data analysis. Data analytic tools continue to develop, driven by the computer revolution. Methods which would have required unimaginable amounts of computing power, and which would have taken years to reach a conclusion, can now be applied with ease and virtually instantly. Such methods are being developed by a variety of intellectual communities, including statistics, artificial intelligence, neural networks, machine learning, data mining, and interactive dynamic data visualization. This conference series seeks to bring together researchers studying the use of intelligent data analysis in these various disciplines, to stimulate interaction so that each discipline may learn from the others.

So as to encourage such interaction, we deliberately kept the conference to a single track meeting. This meant that, of the almost 150 submissions we received, we were able to select only 23 for oral presentation and 16 for poster presentation. In addition to these contributed papers, there was a keynote address from Daryl Pregibon, invited presentations from Katharina Morik, Rolf Backhofen, and Sunil Rao, and a special 'data challenge' session, where researchers described their attempts to analyse a challenging data set provided by Paul Cohen. This acceptance rate enabled us to ensure a high quality conference, while also permitting us to provide good coverage of the various topics subsumed within the general heading of intelligent data analysis.

We would like to express our thanks and appreciation to everyone involved in the organization of the meeting and the selection of the papers. It is the behind-the-scenes efforts which ensure the smooth running and success of any conference. We would also like to express our gratitude to the sponsors: Fundação para a Ciência e a Tecnologia, Ministério da Ciência e da Tecnologia, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Fundação Calouste Gulbenkian and IPE Investimentos e Participações Empresariais, S.A.

September 2001
Frank Hoffmann
David J. Hand
Niall Adams
Gabriela Guimaraes
Doug Fisher

Organization

IDA 2001 was organized by the department of Computer Science, New University of Lisbon.

Conference Committee

General Chair: Douglas Fisher (Vanderbilt University, USA)
Program Chairs: David J. Hand (Imperial College, UK)
    Niall Adams (Imperial College, UK)
Conference Chair: Gabriela Guimaraes (New University of Lisbon, Portugal)
Publicity Chair: Frank Hoppner (Univ. of Appl. Sciences Emden, Germany)
Publication Chair: Frank Hoffmann (Royal Institute of Technology, Sweden)
Local Chair: Fernando Moura-Pires (University of Evora, Portugal)
Area Chairs: Roberta Siciliano (University of Naples, Italy)
    Arno Siebes (CWI, The Netherlands)
    Pavel Brazdil (University of Porto, Portugal)

Program Committee

Niall Adams (Imperial College, UK)
Pieter Adriaans (Syllogic, The Netherlands)
Russell Almond (Educational Testing Service, USA)
Thomas Back (Informatik Centrum Dortmund, Germany)
Riccardo Bellazzi (University of Pavia, Italy)
Michael Berthold (Tripos, USA)
Liu Bing (National University of Singapore)
Paul Cohen (University of Massachusetts, USA)
Paul Darius (Leuven University, Belgium)
Fazel Famili (National Research Council, Canada)
Douglas Fisher (Vanderbilt University, USA)
Karl Froeschl (University of Vienna, Austria)
Alex Gammerman (Royal Holloway, UK)
Adolf Grauel (University of Paderborn, Germany)
Gabriela Guimaraes (New University of Lisbon, Portugal)
Lawrence O. Hall (University of South Florida, USA)
Frank Hoffmann (Royal Institute of Technology, Sweden)
Adele Howe (Colorado State University, USA)
Klaus-Peter Huber (SAS Institute, Germany)
David Jensen (University of Massachusetts, USA)
Joost Kok (Leiden University, The Netherlands)
Rudolf Kruse (University of Magdeburg, Germany)
Frank Klawonn (University of Applied Sciences Emden, Germany)
Hans Lenz (Free University of Berlin, Germany)
David Madigan (Soliloquy, USA)
Rainer Malaka (European Media Laboratory, Germany)
Heikki Mannila (Nokia, Finland)
Fernando Moura Pires (University of Evora, Portugal)
Susana Nascimento (University of Lisbon, Portugal)
Wayne Oldford (University of Waterloo, Canada)
Albert Prat (Technical University of Catalunya, Spain)
Peter Protzel (Technical University Chemnitz, Germany)
Giacomo della Riccia (University of Udine, Italy)
Rosanna Schiavo (University of Venice, Italy)
Kaisa Sere (Abo Akademi University, Finland)
Roberta Siciliano (University of Naples, Italy)
Rosaria Silipo (Nuance, USA)
Floor Verdenius (ATO-DLO, The Netherlands)
Stefan Wrobel (University of Magdeburg, Germany)
Hui Xiao Liu (Brunel University, UK)
Nevin Zhang (Hong Kong University of Science and Technology, Hong Kong)

Sponsoring Institutions

Fundação para a Ciência e a Tecnologia, Ministério da Ciência e da Tecnologia
Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa
Fundação Calouste Gulbenkian
IPE Investimentos e Participações Empresariais, S.A.
Table of Contents

The Fourth International Symposium on Intelligent Data Analysis

Feature Characterization in Scientific Datasets
    Elizabeth Bradley (University of Colorado), Nancy Collins (University of Colorado), W. Philip Kegelmeyer (Sandia National Laboratories)

Relevance Feedback in the Bayesian Network Retrieval Model: An Approach Based on Term Instantiation
    Luis M. de Campos (University of Granada), Juan M. Fernández-Luna (University of Jaén), Juan F. Huete (University of Granada)

Generating Fuzzy Summaries from Fuzzy Multidimensional Databases
    Anne Laurent (Université Pierre et Marie Curie)

A Mixture-of-Experts Framework for Learning from Imbalanced Data Sets
    Andrew Estabrooks (IBM), Nathalie Japkowicz (University of Ottawa)

Predicting Time-Varying Functions with Local Models
    Achim Lewandowski (Chemnitz University), Peter Protzel (Chemnitz University)

Building Models of Ecological Dynamics Using HMM Based Temporal Data Clustering - A Preliminary Study
    Cen Li (Tennessee State University), Gautam Biswas (Vanderbilt University), Mike Dale (Griffith University), Pat Dale (Griffith University)

Tagging with Small Training Corpora
    Nuno C. Marques (Universidade Aberta), Gabriel Pereira Lopes (Centria)

A Search Engine for Morphologically Complex Languages
    Udo Hahn (Universität Freiburg), Martin Honeck (Universitätsklinikum Freiburg), Stefan Schulz (Universitätsklinikum Freiburg)

Errors Detection and Correction in Large Scale Data Collecting
    Renato Bruni (Università di Roma), Antonio Sassano (Università di Roma)

A New Framework to Assess Association Rules
    Fernando Berzal (University of Granada), Ignacio Blanco (University of Granada), Daniel Sánchez (University of Granada), María-Amparo Vila (University of Granada)

Communities of Interest
    Corinna Cortes (AT&T Shannon Research Labs), Daryl Pregibon (AT&T Shannon Research Labs), Chris Volinsky (AT&T Shannon Research Labs)

An Evaluation of Grading Classifiers
    Alexander K. Seewald (Austrian Research Institute for Artificial Intelligence), Johannes Fürnkranz (Austrian Research Institute for Artificial Intelligence)

Finding Informative Rules in Interval Sequences
    Frank Hoppner (University of Applied Sciences Emden), Frank Klawonn (University of Applied Sciences Emden)

Correlation-Based and Contextual Merit-Based Ensemble Feature Selection
    Seppo Puuronen (University of Jyväskylä), Alexey Tsymbal (University of Jyväskylä), Iryna Skrypnyk (University of Jyväskylä)

Nonmetric Multidimensional Scaling with Neural Networks
    Michiel C. van Wezel (Universiteit Leiden), Walter A. Kosters (Universiteit Leiden), Peter van der Putten (Universiteit Leiden), Joost N. Kok (Universiteit Leiden)

Functional Trees for Regression
    João Gama (University of Porto)

Data Mining with Products of Trees
    José Tomé A. S. Ferreira (Imperial College), David G. T. Denison (Imperial College), David J. Hand (Imperial College)

S³Bagging: Fast Classifier Induction Method with Subsampling and Bagging
    Masahiro Terabe (Mitsubishi Research Institute, Inc.), Takashi Washio (I.S.I.R., Osaka University), Hiroshi Motoda (I.S.I.R.