Research Center for my thesis
Characterizing and Mining the Citation Graph of Computer Science Literature
Yuan An (Supervisors: Prof. Evangelos E. Milios, Prof. Jeannette Janssen), commenced on: Aug. 20, 2000
This topic involves characterizing the citation graph of computer science literature, extracted from Citeseer, in the same way that the Web has been characterized. Citeseer (or ResearchIndex, http://citeseer.nj.nec.com/cs) is supposed to contain more than 50,000 computer science papers. Their references form the citation graph, a graph with unknown properties. It is much smaller than the Web, but it may not be complete (i.e. some references are not in the collection), and it can be extracted (in part or in whole) by querying ResearchIndex. As a result, techniques used to characterize the Web may need to be extended in nontrivial ways to characterize the citation graph. Examples of such techniques can be found at:
http://www.almaden.ibm.com/almaden/webmap_press.html,
http://www.almaden.ibm.com/cs/k53/www9.final/ and in the references in the "Web Science" section of:
http://www.cs.dal.ca/~eem/webRobots.html
Variations of the problem include characterizing the citation graph of a subarea of computer science, which may be possible to extract in full from ResearchIndex. The project will involve some web programming (a necessary step, but not the focus of the thesis) and some serious thought about time and space requirements and about data structures for storing the citation graph. A statistical sampling of the citation graph may have to be designed, in the same style as in the Web papers above. The back end will be the application of various graph metrics to characterize the graph.
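The storage and metric steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the citations are assumed to be already extracted as (citing, cited) identifier pairs, the toy identifiers "A" to "D" are made up, and the function names build_citation_graph and degree_distribution are hypothetical, not part of the project or of ResearchIndex. In-degree and out-degree distributions are used here as one example of a graph metric.

```python
from collections import Counter, defaultdict


def build_citation_graph(citations):
    """Store the graph as two adjacency maps built from (citing, cited) pairs."""
    out_links = defaultdict(set)  # paper -> papers it cites
    in_links = defaultdict(set)   # paper -> papers that cite it
    for citing, cited in citations:
        out_links[citing].add(cited)
        in_links[cited].add(citing)
    return out_links, in_links


def degree_distribution(links, nodes):
    """Map each degree value to the number of nodes that have that degree."""
    return Counter(len(links.get(node, ())) for node in nodes)


# Toy data (assumed): each pair means "first paper cites second paper".
citations = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "C"), ("D", "B")]

out_links, in_links = build_citation_graph(citations)
nodes = set(out_links) | set(in_links)

in_dist = degree_distribution(in_links, nodes)
out_dist = degree_distribution(out_links, nodes)

print(sorted(in_dist.items()))   # [(0, 2), (2, 1), (3, 1)]
print(sorted(out_dist.items()))  # [(0, 1), (1, 1), (2, 2)]
```

Keeping both adjacency maps doubles the memory used, but makes in-degree and out-degree queries equally cheap. For a graph on the order of tens of thousands of papers this trade-off is likely acceptable, though that would need to be checked against the real data.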
Keywords: Artificial Intelligence, Graph Theory, Machine Learning, Web Search Engine, Citation Graph.
DB references
Information overload references
Cybermetrics
Bibliometrics
Google Toronto Rental
Hiring story
References in the Web Science section of Dr. Milios's Web
Useful links in Dr. Milios's Web
Dr. Janssen's Web
Dr. Lawrence's Web
Citeseer
CORA search engine
IBM Research Almaden News: Researchers map the web
Graph structure in the web: IBM
Algorithms and Complexity
Dr.Kleinberg's webpage
perl archive
perl libwww-perl
Math Concepts
WWW Consortium
Visualising the Web (a paper)
OMG
JAVA products
MSDN library
A Java development tool: Together
A development environment: Sniff+
A debugger: Metamata debugger
Free Software Foundation: GNU
GNU Unix-workalike tools for Windows
CVS source code control tool
IBM Alphaworks Jikes compiler
Advanced Java for Enterprise App.
The Elements of Style: writing in English
Great books online
EJB, JSP
Java 2 Docs
Java Servlets Doc
JNI doc
JNI tutorial
Forte's Doc
Bibliometrics of the World Wide Web
Dr. Ray R. Larson's webpage: School of Info. Sys. Manag. at UC Berkeley
Cybermetrics, bibliometrics, scientometrics
EUGENE GARFIELD, Ph.D.
The Collection of Computer Science Bibliographies
Research of Barabasi
S. Redner's webpage
Reference on Zipf's law
Java Zone
IBM clever searching
70's book: Information Retrieval
Networked Computer Science Technical Reference Library
Publications of Graph and Application
DB2 V7 Text book
DB2 software
LEDA: graph algorithms
Latex commands
How to use Latex
Latex help 1.1
Math symbols in Latex
web trawling
Internet Requests for Comments(RFC)
Robots
CORA
Course: Information retrieval, digital libraries and the web
Dr. Giles's homepage
Econophysics biblio.
Java RegExp
Unicode regexp
LEDA source code download site
LEDA guide
LEDA guide download
Java net tut
Java Net FAQ, JN FAQ
Good Java net FAQ
RFCs
Power-law Distribution in Real and Virtual Worlds
Bibliography of CS
LEDA object
WEKA package
STL and Quick Reference
Information about C++
Best C++ practices
C++ slides
C++ examples
C reference
C/C++ Library reference
Code example from book: Professional Java Server Programming
STL programmer's guide
GTK+: the GIMP Toolkit
Complete FAQ List of JAVA Techs
C++ FAQ lite
C++ library
CORBA FAQ
Tutorial on CORBA
Popular FAQ
Stanford Digital Libraries
Prof. Mendelzon
Linux Sources
Math contest problem
HRCanada
Interview tips
Thinking in C++ (volume 1)
Thinking in C++ (volume 2)
C++ goodies
Bjarne Stroustrup: C++
Scott Meyers is one of the world's foremost experts on C++ software development
Computer Networks resources
repeater, bridge, router, switch
Dr. Combinatorial
Dr. Approximation algorithm
Dr. Graph decomposition
KL algorithm for graph partitioning
Dr. Kleinberg
GTL software
Graph partitioning course
IBM Fast and effective algorithms for graph partitioning and sparse-matrix ordering