Saturday 24 October 2009

Link bombing spam ring discovered

A few weeks ago I wrote about how spammers are killing me and my project fuzzzy.com.
After posting that blog post I deliberately shut down the site to see what would happen, hoping the spammers would leave for good. After two weeks of downtime I put the site online again, but without the SMTP server connection, so that new users would not get the confirmation e-mail during the sign-up process.

Guess what. Suspicious users still registered at the same rate even though they could not log in. This continued for three weeks. I then turned the SMTP server back on, and within an hour or two the spam started trickling in again.

So here are the things I have tried that did not work:
  • E-mail confirmation.
  • Captcha.
  • Stated clearly that the site is for web enthusiasts only.
  • Wrote that spam would be deleted without notice.
  • Added a question that only a human can answer to the sign-up form ('Are you human?').
  • Changed the URLs of the pages most used by spammers to post spam.
Another thing I tried was to add a question whose answer is commonly known to the target group. The question I added was: 'Who invented the web?'. This actually had some effect. Spam went down by about 50% after I added this question to the sign-up page. It seems that most spammers don't know the answer to this question and move on to other sites to do their spamming.
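
A minimal sketch of how such a knowledge question could be checked on the server, written in JavaScript. The function names and the list of accepted spellings are assumptions for illustration; how fuzzzy.com actually matched the answers is not described in this post.

```javascript
// Hypothetical list of accepted spellings for 'Who invented the web?'.
const ACCEPTED_ANSWERS = ["tim berners-lee", "tim berners lee", "berners-lee", "timbl"];

// Normalize free-text input: lowercase, drop the title "sir",
// replace anything that is not a letter, hyphen or space, collapse whitespace.
function normalizeAnswer(raw) {
  return String(raw)
    .toLowerCase()
    .replace(/\bsir\b/g, " ")
    .replace(/[^a-z\- ]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// True when the normalized answer matches one of the accepted spellings.
function isAcceptedAnswer(raw) {
  return ACCEPTED_ANSWERS.includes(normalizeAnswer(raw));
}
```

Being lenient about case, titles and punctuation matters here, since a genuine web enthusiast who types "Sir Tim Berners-Lee." should not be turned away.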

One thing I have learned is to be very careful about deploying pages that let users enter HTML, so that they can create URLs in free text. Once word of this gets out in the spam ring you will have a hard time fighting them off, even if you remove the ability to add hyperlinks.

Looking at the IP addresses of about 100 of the spammers, these are the typical ISPs:
  • Mango Teleservices, Bangladesh
  • Philippine Long Distance Telephone, Manila, Philippines
  • Digitel Mobile Philippines Inc., Philippines
  • National Internet Backbone, India
  • FibreNet Communications Ltd., Dhaka, Bangladesh
  • Smart Broadband Incorporated, Sorsogon, Philippines
  • TATA Communications formerly VSNL is Leading ISP, Ahmadabad, India
  • Smart Broadband Incorporated, Quezon City, Philippines
  • Bharti Broadband, Delhi, India
  • VietNam Post and Telecom Corporation, Vinh, Vietnam
  • Telefonica del Peru, Peru
  • Grameenphone is the largest telecommunication Orga, Dhaka, Bangladesh
  • NIB (National Internet Backbone), Sivakasi, India
  • FASTER CZ spol. s r.o., Brno, Czech Republic
  • Makedonski Telekom, Skopje, Macedonia
  • SC AVA TELECOM INTERNATIONAL SRL, Bucharest, Romania
  • Vietel Corporation, Hue, Vietnam
  • Telekom Malaysia Berhad, Kuala Lumpur, Malaysia
  • PTCL Triple Play Project, Islamabad, Pakistan
  • RELIANCE COMMUNICATIONS, Madras, India
  • SIA Lattelekom, Priekule, Latvia
  • Sify Limited, Calcutta, India
  • SATNET, Quito, Ecuador
  • SC AVA TELECOM INTERNATIONAL SRL, Bucharest, Romania
The list shows that most of the spammers come from poor countries or countries with high unemployment rates.

Looking further at the activities carried out and the spam they add, my hunch is that there is a link bombing spam ring. Since most automated robots don't get past captchas and other blockers, organized spam cartels outsource the spamming to poor people in developing countries.

One might think that these spammers are the scum of the earth... wait, scum of the web. But if we look at things from a higher perspective we will probably find that the digital divide, the socio-technical and globally networked economics of the world and the immature stage in the evolution of the web are what have really caused this cancerous spam situation.

So fighting the spammers is like slapping around poor thieves caught in the act.
Bashing up the thief will only make him sink deeper into the black hole he is already in. Getting rid of one spammer only leaves room for another spammer. Instead we should focus on prevention and on helping people out of their miserable situation. How do we do that? Fair trade is a good solution. Another good solution is to work on innovative R&D projects that will evolve the web.





Wednesday 30 September 2009

Getting the right job

This blog post is about how to decide between job offers and choose the job you will still like two years from now.

I have a friend who was recently sacked. The guy worked in sales, and in his new job he didn't live up to the expectations of the top management. Part of the reason was the recession, but I believe much of it was a direct result of just choosing the best paid job without considering the things that really matter.

Most people have a very simple approach to deciding on a new job: if the salary of the new job is significantly higher than the current one, switch jobs. Others just go by gut instinct.

I myself do a thorough analysis before starting a new job to make sure I will still be at the same place in a few years. Changing jobs is not a task you should take lightly. Remember that you are going to spend 8 hours a day there, all week, for perhaps 3 years. The process of going to job interviews is also usually quite daunting, so decide with care before applying. Do not apply for a job just because the place sounded cool or the salary was good.

Think of all the noise and problems that can come from choosing the wrong job. So here's a To-Do list based on what I have done before.
  1. Write a list of all the things you look for in an employer.
  2. Prioritize what things are the most important for you. Remember to take into consideration both your short term and long term goals.
  3. Investigate, find and learn about possible employers.
  4. Try to get job offers from companies that seem to fit your list.
  5. Create a table and give scores to each company based on information from job interviews, information you get from others and what you can find on the web.
Criterion | Weight | Company A | Company B
Aligned with future tech trends (avoid working with tech that will make you irrelevant for future employers) | | |
Interesting co-workers with good personality | | |
Salary, bonus | | |
High spirit, positive environment, happy staff | | |
Cool technology | | |
Mixed workforce with diverse skills | | |
Travel distance from home | | |
Paid overtime | | |
Possibility for travel | | |
Stress | | |
Not too much travel | | |
Canteen with good food | | |
Stable company economy | | |
Good business model/vision/strategy | | |
Competent and skilled co-workers | | |
Long term career potential | | |
Short term possibilities for advancement | | |
Managers are good role models | | |
Ethical customer portfolio | | |
Interesting projects and customers | | |
Green environment | | |
Nice premises/facilities | | |
Other social benefits | | |
Workout/health club | | |
Possibilities to go to conferences and seminars | | |
Possibilities to take expensive courses | | |
Somewhere to relax in the office building | | |
Not too much overtime | | |
Flexibility (don't have to be at work from 9 to 5) | | |
Company style (professional/informal) | | |

Fill in the table and replace the items with things that make sense to you. Give each company points for every item and multiply the points by the weight column. The sum of the weighted points at the bottom of the spreadsheet shows which company to choose.
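
The arithmetic behind the table can be sketched in a few lines of JavaScript. The weights and points below are made-up sample numbers, and the function names are hypothetical; only three criteria are shown to keep it short.

```javascript
// Sample data: weight = how important the criterion is to you (assumed 1-5),
// scores = points given to each company (assumed 1-10).
const criteria = [
  { name: "Aligned with future tech trends", weight: 5, scores: { "Company A": 8, "Company B": 4 } },
  { name: "Salary, bonus", weight: 3, scores: { "Company A": 5, "Company B": 9 } },
  { name: "Travel distance from home", weight: 2, scores: { "Company A": 6, "Company B": 7 } },
];

// Sum of weight * points per company, i.e. the bottom row of the spreadsheet.
function weightedTotals(rows) {
  const totals = {};
  for (const row of rows) {
    for (const [company, points] of Object.entries(row.scores)) {
      totals[company] = (totals[company] || 0) + row.weight * points;
    }
  }
  return totals;
}

// Name of the company with the highest weighted total, or null if there is no data.
function bestCompany(rows) {
  const totals = weightedTotals(rows);
  const names = Object.keys(totals);
  if (names.length === 0) return null;
  return names.reduce((best, name) => (totals[name] > totals[best] ? name : best));
}

console.log(weightedTotals(criteria)); // { 'Company A': 67, 'Company B': 61 }
console.log(bestCompany(criteria)); // Company A
```

With these sample numbers Company B pays better, but Company A still wins because the highest weight sits on the tech trend criterion.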

And of course, if you do not get any offers or know you are not going to get any, go make yourself more valuable by learning new stuff, getting more relevant experience or getting a degree.


Sunday 20 September 2009

Spammers are killing me

Over the last few years I have developed an experimental socio-semantic bookmarking service as part hobby and part academic research project. The site can be said to be in early beta testing. I have a lot of plans for the site, but as it is being developed in my spare time the progress is not as fast as I had hoped. Also, as I write this, the project is kind of on the shelf while I explore another exciting hobby project in the area of social location-based services.

Here come the spammers
In the middle of July 2009 I suddenly saw a rise in the number of new members signing up on my bookmarking service fuzzzy.com. Surely it's some robots spamming fuzzzy, I thought. But after looking into the actions performed I soon figured out that the spam was human-generated. Based on the type of actions and the data/metadata entered I could tell these were not generated solely by automated agents. It very much looked like coordinated spamming from a spam ring. There was no pattern in the IP addresses used. Captchas and human-readable-only questions on the sign-up page did not stop them. Judging by the seconds between actions in the log files, bookmark links and free-text HTML links were added and modified in a typical human workflow. Some spammers also added tags and comments.
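
The post does not say how the logs were inspected. As a rough sketch of one way to look at the seconds between actions, assuming each logged action of a user carries an ISO 8601 timestamp, something like the following could be used. The function names and the two-second threshold are hypothetical.

```javascript
// Seconds between consecutive actions for one user, given ISO 8601 timestamps.
function gapsInSeconds(timestamps) {
  const times = timestamps.map((t) => Date.parse(t)).sort((a, b) => a - b);
  const gaps = [];
  for (let i = 1; i < times.length; i++) {
    gaps.push((times[i] - times[i - 1]) / 1000);
  }
  return gaps;
}

// Hypothetical rule of thumb: a session in which every gap is shorter than
// minHumanSeconds is more likely scripted than typed by hand.
function looksScripted(timestamps, minHumanSeconds = 2) {
  const gaps = gapsInSeconds(timestamps);
  return gaps.length > 0 && gaps.every((gap) => gap < minHumanSeconds);
}
```

A check like this only separates fast scripts from slow humans; it says nothing about whether a slow human is a spammer.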

Why are they doing this
Obviously Google PageRank is the root of all evil link bombing spam. In what is often called link bombing, Google bombing, spamdexing or referer spam, spammers add links to a site in order to promote it and make it rank higher on Google and other search engines that use PageRank and similar ranking algorithms.

Spammers killing me slowly
For the last few weeks I have been getting about 30 spam links every day, and the process of removing the spam is killing me. Instead of using my scarce time on development and learning new stuff I am tied down for 10 minutes each day just verifying links and deleting spam. 10 minutes a day is not that much, but it's the feeling of fighting against a mob of EVIL EVIL EVIL real world spammers that really makes me feel sad and frustrated. On just about every page on fuzzzy there is now text saying the site is a community site for people interested in web science and web development. Still, people keep bombing the site with spam links.

So what should I do
As the site is still in sort of early closed beta I don't have a bunch of users that can report, moderate and delete spam.
The few options I see are:
  • Close the site until I decide to focus 100% on the site and a real community is built around the site.
  • Keep deleting spam every day.
  • Develop a spam blacklisting service myself.
  • Report the spam to some third party blacklist.
  • Develop functionality that favours users with high reputation. Links posted by new users are just not shown until the link or the user is voted up or something like that.
If you have any ideas for how to fight the spam please let me know.
The last option or similar approaches seem to be the way to go, but it does seem futile to fight the spam mob. If I can free my site of the spammers they will only move on as parasites to new victims. This only shows how primitive the current state of the web really is.
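
A minimal sketch of the reputation idea from the list above: a link is shown only when its author has enough reputation or the link itself has been voted up. The data shapes, function names and both thresholds are assumptions for illustration, not something fuzzzy.com implements.

```javascript
const MIN_REPUTATION = 10; // hypothetical reputation needed to post visibly
const MIN_LINK_VOTES = 2; // hypothetical votes that make a single link visible

// A link is shown if its author is trusted or the link has been voted up.
function isLinkVisible(link, author) {
  return author.reputation >= MIN_REPUTATION || link.votes >= MIN_LINK_VOTES;
}

// Filter a list of links; links from unknown authors stay hidden.
function visibleLinks(links, usersById) {
  return links.filter((link) => {
    const author = usersById[link.authorId];
    return author !== undefined && isLinkVisible(link, author);
  });
}
```

Hidden links are kept rather than deleted, so a new user who turns out to be genuine does not lose anything once somebody votes the link up.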

Thursday 10 September 2009

Disadvantages of using frames

I often encounter developers who think it is a good idea to use frames or iFrames to implement web applications. The most common argument is that with frames you don't have to reload the entire page. This blog post is an attempt to set the record straight and show why it is not a good idea to use frames unless you have to, for legacy reasons or the like.

Benefits of using frames
  • Navigation can be fixed and always visible without scrolling.
  • Logo always visible which can be used to strengthen branding.
  • Less payload with page requests as only sections of the window are refreshed.
  • You are able to put anything in the frames making the application seem more integrated or as one without having to open new windows or send the user to new sites.
  • If you mess up the markup of your page it might not destroy the entire layout of the site.
Most of the above can be solved without frames using different techniques although some of them would require more advanced technologies such as Ajax or DHTML.

Drawbacks of using frames
  • Making the site search engine friendly is more difficult, and search engines are likely to index it less well.
  • Linking to your site might get tricky. If you are using frames you might need to set up some JavaScript to recreate the frame structure.
  • The end user can not use the browser bookmarking or favourite functionality as normal.
  • Many people do not know how to copy the actual address of a page within a frame setup. The end user can not copy the address of the current page from the browser address field since it probably just displays the root page URL.
  • If you do not have full control over who creates pages for your frames you can not be sure that branding and style guidelines are followed.
  • Normal navigation for end users with back and forward buttons does not work as expected. (The browser address field is not updated)
  • Hyperlinks on the page inside the frames might intentionally or unintentionally open other external content in your frames, degrading user experience and branding.
  • Printing the pages gets more difficult.
  • Printed pages might lose their context, such as the breadcrumb trail, and it will be difficult for the user to get an overview and find out where the page was printed from.
  • Frames might not be supported very well by many types of mobile or other compact clients.
  • The use of frames increases the chance of multiple and horizontal scrollbars when users do not maximize the browser window or when they have a low screen resolution.
  • Frames are less usable for users with disabilities (see WAI/WCAG).
  • Page rendering is slower since firing up frames and doing multiple page requests usually takes more time.
  • The page rendering might seem unnatural as things are not rendered in a linear fashion.
  • You need to make sure all hyperlinks point to the correct sub frames, which requires additional testing.
  • Frames impose restrictions on the graphical design and layout.
  • The refresh-button seldom functions as expected.
  • You can get into copyright problems or confusion about the origin of content when intentionally or unintentionally presenting external content.
  • Many web application frameworks or content publishing systems are not designed for frames and can thus be an obstacle.
  • Footer information gets more difficult to present in a consistent manner when using frames (not iFrames).
  • The user might experience the site as less trustworthy as the current page URL seems to be cloaked.
  • Use of frames is out of date and the site can be perceived as not being modern.
  • The use of frames requires more HTTP requests which, to some degree, outweighs the benefit of a smaller payload.
  • The end user does not get a clear and consistent confirmation on when a new page is loaded.
  • Frames mess up the page metaphor and will make the site more confusing for novice end users.
  • If your frames are using session state you might encounter timeout or login forms in the wrong frames which cause a lot of confusion.
  • As a developer you might experience problems when the front end code base gets large and cross frames code and other javascripts start to make site browsing seem slow or awkward.
  • When using frames it is harder to debug front end issues.
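
The drawback about linking mentions JavaScript that recreates the frame structure. A rough sketch of that idea: a content page that finds itself loaded outside the frameset redirects to the frameset and passes its own address along. The frameset path, the `page` query parameter and the function names are hypothetical, and the frameset page would need its own script to load the requested page into the right frame.

```javascript
const FRAMESET_URL = "/index.html"; // hypothetical address of the frameset page

// Address of the frameset with the wanted content page as a query parameter.
function framesetUrlFor(pagePath) {
  return FRAMESET_URL + "?page=" + encodeURIComponent(pagePath);
}

// If the page is the top-level window it was opened outside the frames,
// so send the browser to the frameset instead. Returns true when redirecting.
function restoreFramesetIfOrphaned(win) {
  if (win.top !== win.self) return false; // already inside the frame structure
  win.location.replace(framesetUrlFor(win.location.pathname + win.location.search));
  return true;
}

// Browser-only wiring; include this on content pages, not on the frameset itself.
if (typeof window !== "undefined") {
  restoreFramesetIfOrphaned(window);
}
```

This is exactly the kind of extra plumbing that a site without frames never needs.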

Conclusion
Avoid frames whenever you can. If you do use them, you should definitely know what you're doing, and frames should only be used for application-type sites that do not need the typical document/page metaphor.

Monday 13 July 2009

Dynamic resizing of iFrame, pros and cons

If you are a web developer you have probably encountered iFrame resizing requirements.
Because of legacy code, your portal framework or some other reason, you have to use an iFrame. Because of the fancy layout from the designer, the iFrame has to fill some section of the page and adjust itself to the content that goes into it. And of course you do not want extra scrollbars appearing on the iFrame: the only vertical scrollbar should be the one at the right edge of the browser window when the page is long.

So how do you go about solving this problem? There are tons of sample code showing what you can do, but you will find few articles summing up the pros and cons of the alternative solutions.

Basically there are two solutions. I will explain them briefly and present a list of pros and cons. I am sure you can find examples of both alternatives yourself. jQuery can be used for both solutions.

Alternative 1) Resize iFrame from within inner frame
Description: With JavaScript on both the outer page (the one hosting the iFrame) and the inner page, the iFrame is resized after the inner page has checked its own height and sent this value to its parent. In order to allow cross-frame scripting you may need to set the document.domain property on both pages.

Pros:
  • We get a scrollbar on the outermost right side (browser window) when the page is long, the same as on all other types of long pages.
  • Script only needed on long pages.
Cons:
  • Might require changes to browser security settings or modification of trusted sites list.
  • When the inner iFrame page has dynamic content (uses Ajax/DHTML) that expands the initial page, multiple scrollbars will appear unless you attach your JavaScript to these DHTML functions.
  • Requires that you have control over the pages that are to be hosted within the iFrame. The JavaScript must be included on each page (or in a shared include/template file).
  • Requires setting the document.domain property using JavaScript, which might cause unexpected problems in the future because of the dependencies it introduces.
  • Both the hosted page and the parent page need to have the same domain name (though not necessarily the same subdomain) and use the same protocol (http/https).
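
A minimal sketch of alternative 1, under the assumption that the outer page exposes a global function named `resizeContentFrame` (a made-up name) and that both pages are allowed to script each other, for instance by sharing document.domain as described above. Note that current browsers have deprecated setting document.domain.

```javascript
// Inner page: measure the height of its own content by taking the largest
// of the commonly used height properties.
function contentHeight(doc) {
  const body = doc.body || {};
  const root = doc.documentElement || {};
  return Math.max(
    body.scrollHeight || 0,
    body.offsetHeight || 0,
    root.scrollHeight || 0,
    root.offsetHeight || 0
  );
}

// Outer page: apply the reported height to the iFrame element.
// resizeContentFrame (hypothetical) would call this with the frame element.
function setFrameHeight(frame, height) {
  frame.style.height = height + "px";
}

// Inner page wiring; only runs in a browser and only when loaded inside a frame.
if (typeof window !== "undefined" && window.parent !== window) {
  window.addEventListener("load", function () {
    // Assumes the outer page defines resizeContentFrame(height) globally.
    window.parent.resizeContentFrame(contentHeight(document));
  });
}
```

Pages that change their height after load (the Ajax/DHTML case in the cons list) would have to report the new height again after each change.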

Alternative 2) Resize iFrame to fit page from outside
Description: Using javascript, iFrame is automatically resized to fit outer window on window load and resize event.

Pros:
  • Approach is guaranteed to work (if JS is enabled). This means less risk and less chance of bugs.
  • Less complex solution and therefore more likely to work over time as you or your IT department do browser upgrades or make changes to browser settings.
  • Can be used to show external resources not under your control.
Cons:
  • The scrollbar will not span the entire height of the window. Only from the position of the top navigation menu or so and down to the window bottom.
  • While resizing the window there will be some flickering and scrollbars might seem sluggish.
  • The iFrame must span the entire width of the window, which might put some restrictions on the layout design.
  • The script that resizes the iFrame always has to be triggered on page load (and resize) and might contribute to slower rendering on very slow machines.
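
A minimal sketch of alternative 2, assuming the iFrame has the id `contentFrame` (a made-up id) and that the outer page itself does not scroll, so the top edge of the iFrame stays at a fixed distance from the top of the window.

```javascript
// Height left for the iFrame below whatever sits above it (navigation, logo).
function availableFrameHeight(windowHeight, frameTop) {
  return Math.max(0, windowHeight - frameTop);
}

// Stretch the iFrame from its top edge down to the bottom of the window.
function fitFrameToWindow(win, frame) {
  const top = frame.getBoundingClientRect().top;
  frame.style.height = availableFrameHeight(win.innerHeight, top) + "px";
}

// Outer page wiring; only runs in a browser.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  const frame = document.getElementById("contentFrame"); // hypothetical id
  if (frame) {
    const fit = function () {
      fitFrameToWindow(window, frame);
    };
    window.addEventListener("load", fit);
    window.addEventListener("resize", fit);
  }
}
```

Because nothing is read from the inner page, the same script works for external content, which matches the last item in the pros list.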

Monday 12 January 2009

Sprint retrospective helper

If you're a developer doing SCRUM having regular sprint retrospective meetings you have probably experienced the situation where you are supposed to write down what went well and what could have been done better. You try to come up with some neat suggestions but you feel that a bunch of topics have been forgotten.

To make it easier to remember stuff for the retrospective, you can start jotting things down while they are fresh in your memory, so you are sure to have them at the retrospective.

The list below was made to help remember stuff for the retrospective. It's a sample list of keywords to jog your memory and help you brainstorm. This list might also be used for small informal project evaluations etc.

Process
standup, daily, meetings, deployment, out of control, reproducing, testing, planning, demos, keeping track, improving, design, results are not lost, handling issues, stabilization, complex, analysis, integration, velocity, timing, flow, pace, think, sprint planning, charts, start and stop, costs, tendencies, peer review, progress, feedback, goals, sticky notes, whiteboard, stepping on toes, race, hand over, significant events, timeline, consensus, management, senior, anchoring.

Policies
source control, decisions, documentation, who decides, hand over, scope and deadlines, quality, staffing, QA, code review, getting credit, mission, vision, risks, bug tracking, issues, automation, analysis, refactoring, activities, accountability, user stories, phases, patterns, architecture, holism, holidays, culture, flow, dummy data, data creation, silo, not used, bounce, ripple effects, peer review, boundaries, proof of concept, skills, metrics, completing stuff before moving on, training, knowledge management, help desk, manual steps, documentation, prototype, risk analysis, cost analysis, contract.

Work environment
collaboration, work hours, new members, number of members, organization, customer, development tools, scrum tool, communication tools, roles, stakeholders, customer involvement, engagement, uncertainties, environments, optimizations, tasks, boring, exciting, change, skills, food, chaos, familiarity, fun, stress, computers, software, 3rd party, external, actors, energy, air, sit/stand, down time, remote/local, crash, branch, control, differences, conformance, competition, constructive, freedom, job security, fatigue, hopelessness, knowledge, oversight, feelings, blame.

Communications
project participants, dependence, overview, noise, out of sync, expectations, errors, bug, reports, backlogs, requirements specifications, availability, members, stakeholders, members are up to date, lessons learned, loops, acceptance, rejection, hours, timeboxing, impediments, design, styling, wireframes, business goals, unity, talking to seniors, backlog, details, understanding, agenda, conclusions, vague, abstract, appreciations, secrets, honest, creativity, confront, good enough, impact, consequences, questions, ask, feasibility, summary.