
On Tue, Oct 20, 2009 at 6:09 AM, Trevor Talbot <quension@mac.com> wrote:
Speaking only for myself, not any particular group...
I've never really been happy with any of the hostmasking implementations I've seen so far. They all seem to have flaws in either mask effectiveness or usability.
Hosts as used on IRC have a few useful properties. For one thing, they are set at connect and do not change; a client always has the same host for as long as it's online. This is a pretty fundamental assumption that makes everything from pretty client displays to access lists and scripting work.
I'm going to use UnrealIRCd as a basis of comparison here, since it is currently the third oldest IRCd I'm aware of (beaten by hybrid and u), and it has successfully deployed hostmasking across what was once a userbase of millions (hey, IRC is dying). It is the only IRCd capable of boasting this.
Another property is that hosts have a hierarchy, where certain parts may change for dynamically-assigned addresses while other parts remain static based on ISP or geographical location; this of course is what makes wildcard masks in access lists etc work.
This remains present in masked hosts on Unreal.
One usability flaw in most hostmasking implementations then is assigning it to a usermode. This means from the perspective of the client itself, its host on IRC is not static, as the mode (and thus its host) can be changed at any time. This breaks the assumption above and affects things as basic as automatic line breaks in channel or private message text. (A client that knows its own host length can calculate the maximum length of message text it can send before one server along the chain will truncate it, and can do its own line splitting to compensate.)
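To make the line-splitting point concrete, here is a minimal sketch of the arithmetic a client might do. It assumes the classic 512-byte IRC line limit (including CRLF) and that relaying servers prepend ":nick!user@host " to the message; the function names and sample values are made up for illustration, and the splitter assumes plain ASCII text.

```python
IRC_MAX_LINE = 512  # bytes per line, including the trailing CRLF (RFC 1459)


def max_privmsg_text(nick, user, host, target):
    """Return how many bytes of text fit in one PRIVMSG once it is relayed.

    Servers prepend ':nick!user@host ' when relaying to other clients, so
    the sender's own host length eats into the budget.
    """
    prefix = f":{nick}!{user}@{host} "
    command = f"PRIVMSG {target} :"
    overhead = len(prefix.encode()) + len(command.encode()) + 2  # 2 = CRLF
    return IRC_MAX_LINE - overhead


def split_text(text, limit):
    """Split ASCII text into chunks of at most `limit` characters."""
    return [text[i:i + limit] for i in range(0, len(text), limit)]


if __name__ == "__main__":
    budget = max_privmsg_text("alice", "al", "example.com", "#chan")
    print(budget, [len(c) for c in split_text("a" * 1000, budget)])
```

If the host changes mid-session to something longer, the budget shrinks by the difference, which is why a client that cached its host length at connect can end up with truncated lines.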
Using the hostname as part of the message length really is a design flaw. The message from the client never includes their own host.
It also affects other clients, as things like bot access lists and channel bans no longer apply properly, WATCH/MONITOR-generated notifications might not be correct for subscribing clients, etc. The most common workaround to this is for each server to fake quit/reconnect the mask-changing client for the benefit of other clients, but whether they internally reevaluate channel bans or deal with WATCH etc appropriately is of course an open question. It's a rather fragile and kludgy practice.
It's rather a successfully implemented practice. As you say, they simulate rejoin/reconnect to update other clients. Ban/watch/etc lists do not need to be reevaluated, as they're checked against both the masked and unmasked hosts, neither of which changes, unless some fruity oper gives the user a custom vhost (which isn't really covered in the realm of hostmasks).
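As a rough illustration of checking a list entry against both hosts, the sketch below matches a ban mask against the real and the masked nick!user@host. It is not taken from any IRCd source; the helper names are hypothetical, and it uses plain case-insensitive matching rather than IRC's own casemapping rules.

```python
import re


def wildcard_to_regex(mask):
    """Translate an IRC-style mask ('*' and '?' wildcards) into a regex."""
    parts = []
    for ch in mask:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def ban_matches(ban_mask, nick, user, real_host, masked_host):
    """True if the ban covers the client under either of its two hosts."""
    pattern = wildcard_to_regex(ban_mask)
    return any(
        pattern.fullmatch(f"{nick}!{user}@{host}") is not None
        for host in (real_host, masked_host)
    )


if __name__ == "__main__":
    real = "10.1.2.3.city.state.isp.tld"
    masked = "mask.city.state.isp.tld"
    print(ban_matches("*!*@10.1.2.*", "bob", "b", real, masked))
    print(ban_matches("*!*@mask.city.*", "bob", "b", real, masked))
```

Because both hosts are fixed for the session, toggling the mask does not change the outcome of the check.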
The proper way to handle masking is to do it on connect, and not allow the masked status to change during the session.
Proper according to who?
I'll assume the goal of hostmasking is to replace the IRC-visible host with one that does not indicate the actual host the client is connecting from.
This is correct. I, for one, don't want people knowing where I'm from, which ISP I use, etc. Using my real IP address, it is possible for someone to locate me in the real world. (Think of a really tiny town with only one cable modem user.)
Hash-based hostmasking, being derived from the client's actual host and therefore anonymous, usually has the secondary goal of maintaining at least a partial hierarchy. This allows channel bans and bot access lists to still cover a group of clients by ISP or region.
No disagreement there.
I often see such masking implemented as hashing hostnames first. As pointed out by others, this is problematic because hostnames vary in the number of components and amount of information available in each component. What happens is that either not enough of the real host is hidden (so masking fails), or not enough of the hierarchy is preserved (so the secondary goal is not met).
10.1.2.3.city.state.isp.tld -> mask.1.2.3.city.state.isp.tld
10.1.2.3.city.state.isp.tld -> mask.isp.tld
Haven't really used Unreal as an admin, but if I read correctly, it would convert 10.2.3.4.city.state.isp.tld to mask.city.state.isp.tld. Essentially chops off everything up until it has a dot followed by a letter.
<snipping most of your comments on IP hashing since it is mostly agreeable>
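The "chop until a dot followed by a letter" rule can be sketched as below. This follows only the description given here, not UnrealIRCd's source; the HMAC-derived token, the "mask-" prefix, and the key are assumptions made so the example is self-contained.

```python
import hashlib
import hmac
import re


def mask_hostname(hostname, key):
    """Replace everything before the first '.<letter>' with a keyed token."""
    token = "mask-" + hmac.new(
        key, hostname.lower().encode(), hashlib.sha256
    ).hexdigest()[:8]
    # Find the first dot that is immediately followed by a letter.
    match = re.search(r"\.(?=[A-Za-z])", hostname)
    if match is None:
        return token  # nothing hierarchical worth keeping
    return token + hostname[match.start():]


if __name__ == "__main__":
    print(mask_hostname("10.2.3.4.city.state.isp.tld", b"network-secret"))
```

With this rule the numeric labels are hidden and the "city.state.isp.tld" suffix survives, so a ban on *.state.isp.tld still applies.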
However, there remains an issue with the secondary goal of maintaining a hierarchy: a mapping attack. A fundamental property of a hierarchical host is that parts of it remain static based on ISP or region. This means that for a hierarchical mask, parts of the mask remain the same, and so obtaining only one real host in the region effectively breaks part of the mask for all hosts in the same region.
5342.8743.2357.9026 -> 10.1.2.3
5342.8743.2357.1824 -> 10.1.2.something
UnrealIRCd uses a system of 3 hashes, followed by '.IP', to make it appear as a hostname to clients. Thus, it reverses the hierarchy. A hashed IP becomes dddd.cccc.bbbb.IP. The whole thing is unique per IP. *.cccc.bbbb.IP is unique per /24, etc. Assuming the hashing algorithm doesn't suck, you'd need to trick a user in the same /24 to reveal their IP to be able to do anything about other users in the same block.
<more things I agree with snipped>
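A minimal sketch of the dddd.cccc.bbbb.IP layout follows. It is not UnrealIRCd's actual algorithm: HMAC-SHA256 truncated to 8 hex digits stands in for whatever hashes the server uses, the key is hypothetical, and the choice of hashing a.b for the bbbb part is an assumption read from the "unique per /24, etc." description.

```python
import hashlib
import hmac
import ipaddress


def _part(key, text):
    """Keyed hash of one address prefix, shortened for display."""
    return hmac.new(key, text.encode(), hashlib.sha256).hexdigest()[:8].upper()


def cloak_ipv4(ip, key):
    """Return 'dddd.cccc.bbbb.IP' with the most specific part first."""
    a, b, c, d = str(ipaddress.IPv4Address(ip)).split(".")
    dddd = _part(key, f"{a}.{b}.{c}.{d}")  # unique per address
    cccc = _part(key, f"{a}.{b}.{c}")      # shared within a /24
    bbbb = _part(key, f"{a}.{b}")          # shared within a /16
    return f"{dddd}.{cccc}.{bbbb}.IP"


if __name__ == "__main__":
    key = b"network-secret"
    print(cloak_ipv4("10.1.2.3", key))
    print(cloak_ipv4("10.1.2.99", key))
```

Two clients in the same /24 share the last three labels, which is what keeps wildcard bans working and also what makes one leaked address informative about its neighbours.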
I'm not aware of any hash-based system that actually achieves the basic goal of hostmasking while still preserving hierarchy. It is an insecure combination by nature.
See above.
Masks based on centrally-assigned identifiers are much more effective, since they cannot reveal part of the real host if they are not based on it. The most common implementation of this is on networks that register user accounts (instead of nicknames), where the user account name becomes the identifying part of the masked hostname.
This becomes a hassle to users who just want to cover their ass. They should not have to waste time identifying on each connect, just to hide their host.
Perhaps this could be applied to DALnet by using a unique identifier mapped to email address instead. (*Not* a hash based on the actual address, just an identifier assigned internally.) While making the resulting hosts not as aesthetically pleasing, it sidesteps the valid character issue, and should narrow the abuse window to the same size as networks using user account registration (as they usually limit number of registrations per email address).
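One way to picture an internally assigned identifier is the sketch below. Everything in it is hypothetical (the class name, the "u<number>" format, the users.example.net suffix), and it treats email addresses as case-insensitive purely for simplicity. The point is only that the mask is looked up, not computed from the address.

```python
import itertools


class MaskRegistry:
    """Hand out opaque identifiers per email address (illustrative only)."""

    def __init__(self, suffix="users.example.net"):
        self._ids = {}
        self._counter = itertools.count(1)
        self._suffix = suffix

    def mask_for(self, email):
        """Return the stable masked host for this email address."""
        key = email.strip().lower()
        if key not in self._ids:
            # The identifier comes from a counter, not from the address,
            # so the mask reveals nothing about the address or the real host.
            self._ids[key] = next(self._counter)
        return f"u{self._ids[key]}.{self._suffix}"


if __name__ == "__main__":
    registry = MaskRegistry()
    print(registry.mask_for("someone@example.org"))
```

A sequential counter does leak registration order; a random identifier with a collision check would avoid that at the cost of slightly uglier hosts.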
Not sure what you're saying here, but if I'm correct in assuming you'd like to include the local part of an email address, do take note that every character is valid in an email address, except NUL.
Another issue is that if masking is optional, then most people consider it desirable for server-side access lists (channel bans, SILENCE, etc) that contain entries based on real hosts to match even if a client is masked. However, that means those lists themselves can be used to perform mapping attacks, e.g. by CTCP PINGing a client repeatedly while modifying your own SILENCE list to see which address components block the response. Note that this works in real time regardless of the masking algorithm used.
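The reason this leak is independent of the masking algorithm can be shown with a purely local simulation. Nothing here talks to a server: `make_oracle` is a made-up stand-in for "did my SILENCE entry block the reply?", and the IPv4-only, octet-by-octet probing is an assumption chosen to keep the example small.

```python
import re


def wildcard_match(mask, text):
    """Match an IRC-style mask ('*' and '?') against text."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in mask
    )
    return re.fullmatch(regex, text) is not None


def make_oracle(real_host):
    """Simulated server behaviour: list entries match the real host."""
    def reply_blocked(silence_mask):
        return wildcard_match(silence_mask, real_host)
    return reply_blocked


def recover_ipv4(reply_blocked):
    """Narrow down the address one octet at a time, counting probes."""
    known, probes = [], 0
    for position in range(4):
        for octet in range(256):
            probes += 1
            guess = ".".join(known + [str(octet)])
            mask = guess if position == 3 else guess + ".*"
            if reply_blocked(mask):
                known.append(str(octet))
                break
    return ".".join(known), probes


if __name__ == "__main__":
    print(recover_ipv4(make_oracle("10.1.2.3")))
```

Probing per component needs at most 4 x 256 = 1024 checks instead of 2^32 blind guesses, and the mask shown to other clients never enters into it.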
This only works if you have enough of the hostmask to begin with. For example, network-deadbeef.hsd1.mi.comcast.net covers over 1 million IPs scattered across blocks totaling tens of millions of addresses. The client would know something's up long before you even got the right /16.
A logical way to avoid this is to make hostmasking non-optional and forget about real hosts entirely.
And here I would have to wholeheartedly disagree wrt needing to forget real hosts. It's a feature that has been refined over the course of a decade, with constant development and testing. Oh, and it works.
-- Quension