Posts filed under ‘Internet’

Comic Chat

I remember using Comic Chat when I was in elementary school, trying it out with my new Internet Explorer install. It was the first chat program I had used, and I thought it was both exciting and scary to be able to talk to complete strangers. Comic Chat is an application that generates comics from online chat, and it uses the IRC protocol.

I was surprised to find this paper on Comic Chat, written by its authors in 1996. Interestingly, it was published at SIGGRAPH, the top computer graphics conference in academia. From reading this paper, I found that Comic Chat is a lot more complicated than I initially thought.

Comic Chat creates realistic comics, which mainly consist of characters, speech balloons, and panels.

Characters

Generating a comic requires placing characters in a panel. Comic Chat used cues present in the text to generate the character’s gesture and expression. Things such as smileys :-), use of “I” or “you”, and punctuation would change the appearance of the character. In addition, the position and orientation of the characters are determined by a greedy algorithm. The following strip has examples of position and orientation issues: the first panel is missing a speaker, the characters in the second panel are not facing each other, and the outer two characters in the third panel are talking over the two middle characters. The fourth panel shows a correctly drawn panel.

[Figure: four-panel strip illustrating character position and orientation]
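To make the idea of text cues more concrete, here is a minimal Python sketch. The cue table, the expression names, and the gesture names are all my own assumptions for illustration; the rules in the paper are richer than this.

```python
import re

# Hypothetical cue table: smiley -> expression (not the paper's actual rules).
SMILEYS = {":-)": "happy", ":)": "happy", ":-(": "sad", ":(": "sad"}


def pick_pose(message: str) -> dict:
    """Choose an expression and a gesture for a character from simple text cues."""
    text = message.strip()

    # A smiley wins; otherwise fall back to the closing punctuation.
    expression = "neutral"
    for smiley, mood in SMILEYS.items():
        if smiley in text:
            expression = mood
            break
    else:
        if text.endswith("!"):
            expression = "excited"
        elif text.endswith("?"):
            expression = "puzzled"

    # Opening with "I" points at the speaker; mentioning "you" points at the listener.
    words = re.findall(r"[a-z]+", text.lower())
    if words and words[0] == "i":
        gesture = "point_at_self"
    elif "you" in words:
        gesture = "point_at_other"
    else:
        gesture = "none"

    return {"expression": expression, "gesture": gesture}
```

Calling `pick_pose("I love this :-)")` combines both kinds of cue: the smiley sets the expression and the leading “I” sets the gesture.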

Balloons

Comics generally use four different types of balloons:

  1. Speech balloons for regular text, drawn with a solid outline and tail
  2. Thought balloons for what a character is thinking, with a solid outline but a tail of ovals
  3. Whisper balloons for private conversation, with a dotted outline and tail
  4. Shout balloons for shouting text, with a jagged outline (not shown in figure)

[Figure: examples of speech, thought, and whisper balloons]

A balloon’s dimensions and placement are determined by a complex algorithm, which you can find in the paper. There are many things to take into consideration when placing balloons: they should be read in the correct order, they should not overlap, they should sit roughly over the speaker’s head, and they should leave room for the tails.

Panels

Panel breaks are calculated to accommodate text properly, and to make the comic appear more natural. Breaks can be made when there are too many characters in a panel, or when there is not enough room for the text. A break is also introduced when a character speaks twice, so that no character has more than one balloon per panel. Panels are usually close-ups, to give a good view of the active character. However, a zoomed-out shot is sometimes used to show the surroundings and the characters in the scene.
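The three break rules can be sketched in a few lines of Python. This is only an illustration: the limits `MAX_CHARACTERS` and `MAX_TEXT` are numbers I made up, and measuring room by character count is an assumption, not how the paper measures balloon space.

```python
MAX_CHARACTERS = 4   # assumed limit on characters per panel
MAX_TEXT = 120       # assumed text budget per panel, in characters of text


def split_into_panels(lines):
    """Group (speaker, text) chat lines into panels using the three break rules."""
    panels, current = [], []
    for speaker, text in lines:
        speakers = {s for s, _ in current}
        used = sum(len(t) for _, t in current)
        needs_break = bool(current) and (
            speaker in speakers                      # speaker already has a balloon
            or len(speakers) + 1 > MAX_CHARACTERS    # too many characters
            or used + len(text) > MAX_TEXT           # not enough room for the text
        )
        if needs_break:
            panels.append(current)
            current = []
        current.append((speaker, text))
    if current:
        panels.append(current)
    return panels
```

For example, if ann speaks, bob replies, and ann speaks again, the second line from ann starts a new panel.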

While Comic Chat has become antiquated and few people now use it to chat online, it still has some value today. I realized when reading this paper that the webcomic Jerkcity is made with Comic Chat.

You can download a copy of Comic Chat if you want to give it a spin.

Kurlander, D., Skelly, T., & Salesin, D. (1996). Comic Chat. Proceedings from SIGGRAPH ’96: International Conference on Computer Graphics and Interactive Techniques, 225-236. [PDF]

December 4, 2006 at 1:09 pm 10 comments

When do you like someone like yourself? An analysis of online dating

Online dating is gaining momentum and is an easy, socially acceptable way to find partners for dates or relationships. To a social scientist, the wealth of data stored on online dating services has enormous potential in the study of interpersonal relationships. Instead of having to take surveys and interview people, scientists can now discover findings by looking at the statistics of what actually happened. Actions speak louder than words. Never before has something so human and primitive been reducible to such quantitative discrete values.

Do opposites attract? Apparently not. This study of an online dating service measures the importance of a matching characteristic when choosing a partner. The data is extracted from the contacts initiated by the users.

Characteristic Increased Contact
Marital status 1.64x
Wants children 1.54x
Number of children 1.39x
Physical build 1.28x
Smoking 1.25x
Physical appearance 1.23x
Educational level 1.19x
Religion 1.17x
Race 1.14x
Drinking habits 1.12x
Pet preferences 1.11x
Pets owned 1.08x
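One plausible way to arrive at a multiplier like those in the table is to compare how often contacts actually match on a characteristic with how often they would match if people were paired at random. The Python sketch below does that; I have not checked that this is the exact calculation used in the study, and the profile format and function name are my own assumptions.

```python
from collections import Counter


def match_ratio(contacts, profiles, key):
    """Observed share of matching contacts divided by the share expected at random.

    contacts: list of (sender_profile, recipient_profile) pairs (dicts)
    profiles: list of all member profiles (dicts)
    key:      the characteristic to compare, e.g. "smoking"
    """
    observed = sum(a[key] == b[key] for a, b in contacts) / len(contacts)

    # Chance that two members drawn at random (with replacement) share a value.
    counts = Counter(p[key] for p in profiles)
    total = len(profiles)
    expected = sum((n / total) ** 2 for n in counts.values())

    return observed / expected
```

A ratio above 1 means members contacted people like themselves more often than chance alone would predict.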

 

Demographic findings in this study:

  • 62.8% of members were male and 37.2% were female, but 55% of active members were female
  • The median age was 36 for men and 33 for women
  • 78.2% of messages were never responded to
  • Members sent an average of 1.5 messages
  • Men initiated 73.3% of messages, but their initiations were 17.9% less likely to be reciprocated

A more detailed analysis of online dating is given in the author’s thesis.

I found this paper by browsing the list of students of Judith Donath, who was also one of my professor’s advisors. Fiore’s master’s thesis was about online dating; I bet that made for interesting party conversation.

Fiore, A. T. & Donath, J. S. (2005). Homophily in Online Dating: When Do You Like Someone Like Yourself?. Proceedings from CHI ’05: Conference on Human Factors in Computing Systems, 1371-1374. [PDF]

September 27, 2006 at 8:47 pm 50 comments

Massively Multiplayer Online Games as “Third Places”

A major concern about home media such as television and the Internet is that they are replacing essential social institutions and community. While a previous post has indicated that this might not be true, this research paper looks at massively multiplayer online games such as World of Warcraft to determine if they are indeed “third places”.

What is a third place? The first place is your home, where you can relax and be comfortable. The second place is where you usually are when not at home: work, which provides social interaction and a sense of community. The sociologist Ray Oldenburg introduced third places as somewhere besides home or work where people can socialize and feel comfortable, an idea that Howard Schultz of Starbucks later popularized. Think Cheers.

The paper argues that online games qualify as third places by checking them against the eight characteristics that define a third place.

Neutral Ground: Individuals are free to come and go as they please. In online games, players are not obligated to play; joins and quits are not significant events.

Leveler: An individual’s rank and status in society are not significant. As in the culture of early video game arcades, “It didn’t matter what you drove to the arcade. If you sucked at Asteroids, you just sucked.” Players in online games use an avatar unrelated to their real-life identity, and social status is rarely invoked.

Conversation is Main Activity: In third places, conversation is the main activity that the individuals participate in. While debatable as the main activity in online games, players would not disagree that conversation plays a crucial role. Often, conversation drifts to real world discussion such as personal life, politics, culture, etc.

Accessibility & Accommodation: Third places are easy to access and accommodating to individuals. Online games allow players to log on and off at will and there are always players online. Activity occurs throughout all hours of the day.

The Regulars: Regulars are those who give the place its character, and attract new individuals. Guild members, who form a clan to play the online game together, and squatters, who stay within an area of the game, are the regulars of the online world.

A Low Profile: Third places are characteristically homely and without pretension. The population of an online game follows a parabolic curve: after the initial surge of players at release, the regulars remain while many others move on to higher-profile games.

The Mood is Playful: The general mood of a third place is playful and witty. Players in online games crack jokes during heated battles, perform goofy actions with their avatars, and mock each other’s appearances. Rarely are players overly serious about game matters.

A Home Away from Home: Third places offer rootedness, feelings of possession, spiritual regeneration, feelings of being at ease, and warmth. Online games have a homely atmosphere in which players notice others’ absences, which makes the overall feel of the game “warm”.

Social capital is analogous to financial capital in that it can be acquired and spent, but for social gains instead of financial ones, such as being comforted or receiving advice. Bridging is when individuals connect with those from different backgrounds. The advantages of bridging social capital include access to new information and resources. Bonding is when individuals who are already close support each other, strengthening the relationship. In a sense, bridging provides breadth while bonding provides depth.

In online games, players come from diverse backgrounds, so they are usually bridging social capital. However, it’s not uncommon for a bond to grow during an online game if individuals play together for a long period of time.

Online games fit the definition of a third place, but as players become more hardcore and focus more on gaming, their function as a third place wanes.

I read this paper after attending a related talk by one of the authors, and you might find his other publications just as interesting.

Steinkuehler, C. & Williams, D. (2006). Where Everybody Knows Your (Screen) Name: Online Games as “Third Places”. Journal of Computer-Mediated Communication, 11(4), article 1. [HTML]

September 19, 2006 at 10:32 pm 44 comments

The Impact of Communication Technology on Lying Behavior

Lying is a frequent, and sometimes necessary, part of our lives. A study finds that 26% of a person’s overall daily interactions involve some sort of deception (1.6 lies/day on average).

But how does technology impact the number of lies we tell? Researchers asked 30 volunteers to record their daily interactions and lies told during the interactions.

Four modes of communication were investigated:

Phone: 37% of phone calls involved deception.

Face-to-Face: 27% of face-to-face conversations involved deception.

Instant Messaging: 21% of IM conversations involved deception.

Email: 14% of emails involved deception.

Two existing theories fail to explain the difference in lying frequency among technologies.

The Media Richness Theory says that people will lie more if the medium of communication is richer. However, the data contradicts this because lies during phone conversations occur more often than lies in face-to-face communication.

The Social Distance Hypothesis claims the opposite: people prefer to lie when the medium of communication is less rich, because the lie is harder to detect and because lying makes them nervous. However, the results contradict this too, since lies occur more often face-to-face than over email.

The paper presents an alternative theory: the amount of lying depends on whether the medium of communication is asynchronous, whether it is recorded, and whether the participants are in the same physical location. Phone conversations have none of these properties, so lying is most likely to occur there. Email, on the other hand, is both asynchronous and recorded, so lies happen least frequently over email.
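A small Python sketch shows how this feature-based idea lines up with the numbers above. The true/false coding of each medium is my own reading of the three properties (for instance, I assume instant messaging counts as recorded because conversations can be logged), and simply counting properties is a simplification of the paper’s argument.

```python
# Assumed coding of each medium against the three properties that discourage lying.
MEDIA = {
    "phone":             {"asynchronous": False, "recorded": False, "same_location": False},
    "face-to-face":      {"asynchronous": False, "recorded": False, "same_location": True},
    "instant messaging": {"asynchronous": False, "recorded": True,  "same_location": False},
    "email":             {"asynchronous": True,  "recorded": True,  "same_location": False},
}

# Share of interactions involving deception, as reported in the post (percent).
OBSERVED = {"phone": 37, "face-to-face": 27, "instant messaging": 21, "email": 14}


def deterrents(medium: str) -> int:
    """Count how many lie-discouraging properties a medium has."""
    return sum(MEDIA[medium].values())


def rank_by_predicted_lying() -> list:
    """Media ordered from most to least predicted lying (fewest deterrents first)."""
    return sorted(MEDIA, key=deterrents)


for medium in rank_by_predicted_lying():
    print(f"{medium}: {deterrents(medium)} deterrent(s), {OBSERVED[medium]}% observed")
```

Under this coding the phone has no deterrents and email has two, which matches the two ends of the observed ranking; face-to-face and instant messaging tie in the middle.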

However, aren’t the functions of each medium somewhat different? Email is more likely to be used to make announcements or detailed plans. It would seem that this type of communication is unlikely to contain a lie. Instant messaging is often used for quick exchange of information and so there is also not a lot of room for lying there. The study also mentioned, but didn’t take into account, the difference in length of communication between media, which would likely skew the results. Hence, it would seem that the likelihood of lying is less clear-cut than suggested by the paper and is instead influenced by a wide range of factors.

One other finding from the study is that lies were more likely to be premeditated when over email; this may be somewhat obvious since email is the only asynchronous form of communication investigated, giving the liar more time to perfect the lie.

Hancock, J. T., Thom-Santelli, J., & Ritchie, T. (2004). Deception and Design: The Impact of Communication Technology on Lying Behavior. Proceedings from CHI ’04: Conference on Human Factors in Computing Systems, 129-134. [PDF]

September 16, 2006 at 2:11 am 9 comments

How personalization and authentication affect Internet surveys

Internet surveys are an efficient way of collecting information. They have been shown to increase self-disclosure for sensitive questions, and also reduce “good” answers (more socially acceptable responses).

An interesting dilemma occurs when the participant comes across a question they might not want to answer, such as, “What is your salary?”

They can choose to passively not answer (no response, or the default choice), or they can actively not answer (selecting the option “I prefer not to answer”).

Authentication is when the participant needs to log in to take the survey, as opposed to going to a URL that encodes the participant’s information in the address. One of the studies shows that “I prefer not to answer” was chosen more often when authentication (log in) was used, versus when the URL encoded the information.
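To show what “a URL that encodes the participant’s information” might look like in practice, here is a short Python sketch. The base URL and the `pid` parameter name are hypothetical and are not taken from the study.

```python
from urllib.parse import parse_qs, urlencode, urlparse

BASE_URL = "https://survey.example.org/start"  # hypothetical survey address


def invitation_link(participant_id: str) -> str:
    """Build a survey link that carries the participant's ID in the query string."""
    return f"{BASE_URL}?{urlencode({'pid': participant_id})}"


def participant_from_link(link: str):
    """Recover the participant ID from a link, or None if it is missing."""
    values = parse_qs(urlparse(link).query).get("pid", [])
    return values[0] if values else None
```

With a link like this the participant never types a username or password, which is the contrast with the log-in condition.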

Personalization is when the email inviting the participant to do the survey had a salutation that identified the participant. The salutation would be something like “Dear Napoleon” instead of “Dear Student”. In the study, surveys where the invitation email had a personalized salutation did not generate a significant difference in non-responses to sensitive questions. However, the ratio of active non-responses (“I prefer not to answer”) to passive non-responses (skipping the question) did increase.

To conclude, reduction of anonymity reduced responses to sensitive questions. However, it also encouraged participants to answer questions “better”. It’s interesting that such a minute detail would have a significant effect on responses to sensitive questions.

Joinson, A. N., Woodley, A., & Reips, U. (2007). Personalization, authentication, and self-disclosure in self-administered Internet surveys. Computers in Human Behavior 23(1), 275-285. [PDF]

September 12, 2006 at 3:39 pm

Looking back at Search Queries from 1997

In light of the recent search query logs released by AOL, I perused some to see what others have been searching for. There’s a bit of voyeur in all of us.

My initial reaction was that search queries have gotten a lot more sophisticated since the 1990s, when common searches were “free downloads” or “britney spears”.

This paper looks at over 1 million queries from the query logs of the Excite search engine from September 1997. The top 25 queries were (in order): and, of, sex, free, the, nude, pictures, in, university, pics, chat, for, adult, women, new, xxx, girls, music, porn, to, gay, school, home, college, state.

Findings from the study:

  • The mean number of terms per query was 2.4
  • Fewer than 5% of queries used Boolean operators (AND, OR, NOT, +, -, “”, etc.)
  • 48.4% of users submitted a single query, 20.8% two queries, 31% three or more
  • Modified queries usually added terms rather than removing them
  • 28.6% of users stayed on the first page of results, 19% looked at two pages
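Figures like the mean number of terms per query and the list of top terms are easy to compute from a query log. Here is a minimal Python sketch, assuming the log is just a list of query strings and that terms are separated by whitespace; the paper’s own tokenization may differ, and the function name is mine.

```python
from collections import Counter


def query_stats(queries, top_n=5):
    """Return (mean terms per query, most common terms) for a list of query strings."""
    term_lists = [q.lower().split() for q in queries if q.strip()]
    if not term_lists:
        return 0.0, []
    mean_terms = sum(len(terms) for terms in term_lists) / len(term_lists)
    counts = Counter(term for terms in term_lists for term in terms)
    return mean_terms, counts.most_common(top_n)
```

Because every whitespace-separated word counts as a term, words like “and” and “of” are counted too, which is consistent with their showing up at the top of the Excite list.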

2.4 terms per query seems a bit low. A glance at the AOL query logs (totally not scientific) gives me the impression that people generally use more terms nowadays. Also, I would say the term “free” would be rarely used nowadays, since it’s been beaten to death by every commercial website out there. Basically, search queries have gotten more sophisticated since 1997.

One interesting remark from the paper:

That is, close to half of the users looked at two or less pages. Were users so satisfied with the results that they did not need to view more pages?

To me, this is surprising — I rarely go past the first page nowadays on Google. Did we really browse through pages and pages of search results back then? Tedious.

Spink, A., Wolfram, D., Jansen, M., & Saracevic, T. (2001). Searching the Web: The Public and Their Queries. Journal of the American Society for Information Science and Technology, 52(3), 226-234. [PDF] [HTML]

September 6, 2006 at 4:48 pm 2 comments

How Spammers Steal Your Email Address

Ever wonder who’s scraping your email from a website for spamming?

Project Honey Pot is a project that aims to analyze email harvesters by setting up honeypots on hundreds of thousands of websites. They have some interesting findings about the geographical source of harvesting and processing, sending patterns of different types of spammers, and email list management behaviors.

Email harvesters can be categorized into two types, termed “hucksters” and “fraudsters”.

Hucksters wait longer between harvesting an email address and sending spam to it. They have more sophisticated harvesting algorithms, generally send a large volume of spam, and their emails typically sell a product.

Fraudsters send spam almost immediately after harvesting an email address. They send a small number of messages to each address, and their emails typically involve some sort of fraud (phishing, “advance fee” fraud, etc.).

My thoughts are that hucksters are a more organized group of spammers who together build email lists, send bulk email, and sell products for profit. Meanwhile, the fraudsters are simply individual spammers looking to make a quick buck.

The geographical origin of harvesters and spammers breaks down as follows:

Harvesters

United States 32.1%
Romania 17.1%
China 12.3%
United Kingdom 8.6%
Japan 7.2%
France 6.9%
Spain 4.3%
Egypt 4.0%
Nigeria 3.7%
Canada 3.7%

Spammers

United States 38.4%
China 14.9%
Korea 13.4%
France 7.6%
Brazil 6.3%
Japan 5.3%
Taiwan 4.0%
Spain 3.6%
United Kingdom 3.6%
Canada 2.7%

Note that there seems to be some “outsourcing” going on, since Romania is the #2 country for harvesting but doesn’t appear in the top 10 for spamming.

So what are the most effective ways to munge your email address (obscure it from harvesters) on a website?

  • Putting the email address in an image
  • Using JavaScript to render the address (harvesters are unlikely to execute JavaScript)
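Here is a small Python sketch of the second approach: it generates an HTML snippet whose inline script assembles the address in the browser, then checks it against a deliberately naive harvester. The regular expression and both function names are my own assumptions; real harvesters may be smarter than this.

```python
import json
import re

# A naive harvester pattern (assumption): anything that looks like user@host.tld.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def munge_email(address: str) -> str:
    """Return an HTML snippet that writes the address using JavaScript.

    The raw address never appears in the page source: the user and domain
    are stored separately and joined with the "@" character (code 64)
    only when the script runs in a browser.
    """
    user, domain = address.split("@", 1)
    parts = json.dumps([user, domain])
    return f"<script>document.write({parts}.join(String.fromCharCode(64)));</script>"


def harvest(html: str) -> list:
    """Scrape email-like strings from page source without running any scripts."""
    return EMAIL_PATTERN.findall(html)
```

A harvester that only scans the page source finds the address in `<p>jane@example.com</p>` but finds nothing in the output of `munge_email("jane@example.com")`, while a visitor whose browser runs the script still sees the address.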

For the latest Project Honey Pot statistics, click here.

Prince, M. B., Holloway, L., Langheinrich, E., Dahl, B. M., & Keller, A. M. (2005). Understanding How Spammers Steal Your E-Mail Address: An Analysis of the First Six Months of Data from Project Honey Pot. Proceedings from CEAS ’05: Conference on Email and Anti-Spam. [PDF]

September 2, 2006 at 1:05 am 9 comments
