Syracuse University · Data Lab

🍎 TUAW

TUAW is a weblog dedicated to disseminating information on Apple products and services.

Network Statistics
Nodes: 10K
Edges: 11
Avg Degree: 0.0
Missing: No
Category: Blogging
[Charts: size relative to repository maximum (nodes, edges); nodes and edges repository comparison, logarithmic scale, with this dataset highlighted; edge-to-node ratio as a network density indicator]
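
The reported average degree of 0.0 and the edge-to-node ratio can be reproduced from the node and edge counts. The following is a minimal sketch, not the repository's own calculation: it assumes an undirected network (average degree 2E/N) and treats "10K" as exactly 10,000 nodes, neither of which is confirmed by this page. The function names are hypothetical.

```python
def average_degree(nodes: int, edges: int) -> float:
    """Average degree of an undirected graph: each edge adds 2 to the degree sum."""
    return 2 * edges / nodes


def edge_to_node_ratio(nodes: int, edges: int) -> float:
    """Edges per node, a rough indicator of network density."""
    return edges / nodes


if __name__ == "__main__":
    nodes, edges = 10_000, 11  # assumed exact value for the rounded "10K"
    # With so few edges, the average degree rounds to 0.0 at one decimal place.
    print(round(average_degree(nodes, edges), 1))
    print(edge_to_node_ratio(nodes, edges))
```

Under these assumptions the unrounded average degree is about 0.0022, which is why the page shows 0.0.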
Dataset Details

Source

Identifying the influential bloggers in a community

Nitin Agarwal, Lei Tang, Huan Liu, and Philip S. Yu

Dataset Information

Each instance in the dataset represents a blogpost and consists of the following 11 attributes:

1. Name: Title
    Type: String
    Info: This attribute represents the title of the blogpost.
    Missing Values: No
2. Name: Date
    Type: String
    Info: The date the blogpost was posted on TUAW.
    Missing Values: No
3. Name: Blogger
    Type: String
    Info: Author of the blogpost.
    Missing Values: No
4. Name: Categories
    Type: String (multiple values separated by :&:)
    Info: Categories assigned to the blogpost.
    Missing Values: Yes
5. Name: Post
    Type: String
    Info: Text from the blogpost.
    Missing Values: No
6. Name: Post_Length
    Type: int
    Info: Length of the blogpost.
    Missing Values: No
7. Name: No_of_outlinks
    Type: int
    Info: Number of references or outlinks in the blogpost to external content.
    Missing Values: No
8. Name: No_of_inlinks
    Type: int
    Info: Number of links citing this particular blogpost. This data was retrieved by using the link search feature of Technorati.
    Missing Values: No
9. Name: No_of_comments
    Type: int
    Info: Number of comments received by the blogpost.
    Missing Values: No
10. Name: Comments_URL
    Type: String
    Info: Permanent URL to the comments page.
    Missing Values: No
11. Name: Permalink
    Type: String
    Info: Permanent link to the blogpost.
    Missing Values: No
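
The attribute list above can be sketched as a typed record. This is a minimal illustration, not the repository's loader: the on-disk file format is not described here, so the code assumes each blogpost has already been read into a dictionary keyed by the attribute names. `BlogPost`, `split_categories`, and `parse_record` are hypothetical names, and the sample values in the usage note are made up.

```python
from dataclasses import dataclass
from typing import Dict, List

CATEGORY_SEPARATOR = ":&:"  # separator given in the Categories attribute


@dataclass
class BlogPost:
    title: str
    date: str  # kept as a string; the date format is not specified
    blogger: str
    categories: List[str]
    post: str
    post_length: int
    no_of_outlinks: int
    no_of_inlinks: int
    no_of_comments: int
    comments_url: str
    permalink: str


def split_categories(raw: str) -> List[str]:
    """Split a Categories value on ':&:'; a missing or empty value gives []."""
    if not raw:
        return []
    return [part.strip() for part in raw.split(CATEGORY_SEPARATOR) if part.strip()]


def parse_record(record: Dict[str, str]) -> BlogPost:
    """Build a BlogPost from a dict keyed by attribute name (assumed layout)."""
    return BlogPost(
        title=record["Title"],
        date=record["Date"],
        blogger=record["Blogger"],
        # Categories is the only attribute marked as possibly missing.
        categories=split_categories(record.get("Categories", "")),
        post=record["Post"],
        post_length=int(record["Post_Length"]),
        no_of_outlinks=int(record["No_of_outlinks"]),
        no_of_inlinks=int(record["No_of_inlinks"]),
        no_of_comments=int(record["No_of_comments"]),
        comments_url=record["Comments_URL"],
        permalink=record["Permalink"],
    )
```

For example, `split_categories("iPod:&:Software")` yields `["iPod", "Software"]`, and a record without a Categories entry parses to a `BlogPost` whose `categories` is an empty list.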

Attribute Information

The dataset consists of blogposts crawled from The Unofficial Apple Weblog (TUAW), a blog site dedicated to Apple products and services.
The site has a closed community of bloggers, and other users are allowed to comment on the blogposts. The dataset covers
blogposts from January 2004 through February 2007, along with metadata such as the number of inlinks.
How to Cite
If you publish material based on data from this repository, please acknowledge the Data Lab Social Computing Data Repository at Syracuse University in your acknowledgements. This helps others find and replicate your work.

APA Format

R. Zafarani and H. Liu. (2026). Social Computing Data Repository [https://datasets.syr.edu]. Data Lab, Syracuse University.

BibTeX Format

@misc{DataLab:SU,
  author       = {R. Zafarani and H. Liu},
  year         = {2026},
  title        = {Social Computing Data Repository},
  url          = {https://datasets.syr.edu},
  institution  = {Data Lab, Syracuse University}
}