From neural networks to general AI
Neural networks are the backbone of deep learning, a branch of machine learning, which is in turn a subfield of artificial intelligence (AI). General AI is the more ambitious goal within that field: a computer program able to perform the wide range of tasks that would otherwise require human intelligence or expertise. Neural networks work by filtering information and attempting to find patterns in that information.
One such neural network is the Hopfield network, which is typically used for pattern recognition and associative memory tasks. It consists of nodes joined by weighted connections that can be positive or negative. Patterns are stored in those connection weights. When the network is later given a partial or noisy pattern, the nodes repeatedly update one another until the network settles on the closest stored pattern. In this way, the network as a whole can store and retrieve data, something that might be akin to memory in humans.
We discuss this with Tom Burns, a PhD researcher in Okinawa, Japan, who works on topics such as computational neuroscience and general AI.
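As a rough illustration of the store-and-retrieve behaviour described above, here is a minimal sketch of a classic binary Hopfield network. It assumes the textbook Hebbian outer-product storage rule and one-neuron-at-a-time updates; the function names, the random seed, and the choice of 5 patterns over 100 neurons are illustrative assumptions, not something taken from the interview.

```python
import numpy as np

def train_hebbian(patterns):
    """Build a weight matrix from +/-1 patterns with the Hebbian outer-product rule."""
    num_neurons = patterns.shape[1]
    weights = patterns.T @ patterns / num_neurons
    np.fill_diagonal(weights, 0.0)  # no neuron connects to itself
    return weights

def recall(weights, state, max_sweeps=20):
    """Update neurons one at a time until a full sweep changes nothing."""
    state = state.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            new_value = 1 if weights[i] @ state >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state

rng = np.random.default_rng(0)
patterns = rng.choice([-1, 1], size=(5, 100))  # 5 random memories, 100 neurons
weights = train_hebbian(patterns)

# Corrupt the first memory by flipping 10 of its 100 values, then try to recall it.
cue = patterns[0].copy()
flipped = rng.choice(100, size=10, replace=False)
cue[flipped] *= -1
recovered = recall(weights, cue)

# An overlap of 100 would mean every neuron matches the stored memory.
overlap = int(recovered @ patterns[0])
print("overlap with stored pattern:", overlap)
```

With only a handful of stored patterns, the corrupted cue should normally settle back onto the original memory. Storing many more patterns in the same 100 neurons makes recall increasingly unreliable.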
AI, data science and more
If you are interested in topics such as data science, AI, or blockchain, make sure to get in touch. Our team will be more than happy to help you out and answer any queries which you might have.
Transcript: Tom Burns on Hopfield Networks and the nature of General AI
Stelios: Hi, everyone. Welcome to the Data Scientist Podcast. I’m very happy to have here with us today Tom Burns, a Ph.D. researcher in Okinawa, Japan, who’s working in the areas of neural networks and computational neuroscience. Hi, Tom.
Tom Burns: Hi.
Stelios: Would you like to introduce yourself and tell us a few things about your work? You’re probably going to do a better job of it than what I’m going to do.
Tom: Sure, yeah. So, I’m originally from Australia, and currently I’m a Ph.D. student in Okinawa, Japan, as you just said. I’m interested in how the brain is basically so intelligent. In particular, I’m interested in how neural networks can do all the things they do, both in biological neural networks and artificial neural networks. I’m quite interested in memory and navigation, as well as how we can use geometry and topology and ideas from mathematics to improve those tasks or understand the tasks in neural networks.
Stelios: That’s fascinating. So, those of you who are regular listeners to our podcast, or regular viewers for those of you who follow us on YouTube, you know that most of our episodes are around things like data strategy, etcetera. But [inaudible] just realized this episode is going to be a bit more theoretical. Computational neuroscience, and generally these kinds of areas, are really passions of mine. So, I’m very happy to have the opportunity to speak about these topics today with Tom, who’s a researcher in this area.
That being said, I’d like to ask you, Tom, a few things about your work. I know you’re working on Hopfield networks and attractors. That’s something very interesting, because that’s an area of research that has been going on for decades. I know a few decades ago, which was before deep learning really took off, or maybe before deep learning was even discovered, some researchers tried to use this kind of network for different applications, but things didn’t really work out. What do you feel is different now compared to 20, 30, 40 years ago, when these types of neural networks were discovered, both in AI and applications, but maybe also in cognitive science and how we approach these types of networks and models?
Tom: Yeah. Well, that’s a very big question. As you know, back in the 80s, John Hopfield created this model, which we now call the Hopfield model. It was an associative memory network where each of the neurons took on binary values, plus or minus 1, or you can use 1 and 0, and you have some number of these neurons. Let’s say you have 100. Then what you do is you store memories in this network as binary activations of the neurons, 1 or 0, or plus or minus 1. If you do that in a random fashion, it turns out that theoretically you can store about 14 patterns if you have 100 neurons. When you add, say, the 15th or 16th pattern, you can no longer pattern match the patterns. It means you can’t recall the patterns.
If you give part of a pattern to this network, then you can retrieve the whole pattern. For a long time, it was thought of as a theoretical concept about how to store and retrieve memories, not only in neural systems in general, like biological neural networks, but also with some applications. These applications are very limited, as you said. First of all, it’s limited in the data structure, right? You can only use these binary values. But second of all, it’s quite limited in terms of the number of items of memory that you can store, how you retrieve them, and how these things are used.
For a long time, it was like that, and people did different extensions, different types of technical work. Theoretical people, especially physicists, did a lot of work on finding out these theoretical capacities, because they turn out to have a correspondence to different physical systems to do with magnets and atoms and things like this.
Specifically, if your viewers are interested, they can look up spin glass systems. But what happened more recently, in the last decade and especially the last five years, is that we started to realize something. It was discovered a long time ago as well, back in the 90s I think, that instead of only having binary values, you can extend this to continuous values between, let’s say, 0 and 1, but then this had some technical problems. In the last 5 years, we’ve realized that we can overcome that. We can store even more patterns, and we can store these continuous-valued patterns. Maybe without going into all the technical details, because it’s a long technical story, it turns out that transformers, and the attention mechanism used in transformers in modern neural networks, are actually doing something basically equivalent to the Hopfield network update rule, which is how the Hopfield network starts to retrieve a memory and continues retrieving the memory.
So, it’s doing an associative memory recall task. What’s interesting in particular about transformers is that they came from a totally different line of research, and it turns out that they basically just serendipitously fell on the same sort of intuition and cognitive or computational structure as Hopfield created back in 1982.
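To make the correspondence Tom describes a little more concrete, the sketch below compares one step of a continuous-valued Hopfield update with scaled dot-product attention. It rests on simplifying assumptions that are not from the interview: the update is taken to be a softmax-weighted average of the stored patterns, the stored patterns serve as both keys and values with no learned projections, and the sharpness parameter `beta` is set to one over the square root of the pattern dimension. All names are illustrative.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over the last axis."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def hopfield_update(stored, query, beta):
    """One continuous Hopfield step: a softmax-weighted average of stored patterns."""
    similarities = stored @ query  # one similarity score per stored pattern
    return softmax(beta * similarities) @ stored

def attention(queries, keys, values):
    """Scaled dot-product attention as used in transformers."""
    dim = keys.shape[-1]
    return softmax(queries @ keys.T / np.sqrt(dim)) @ values

rng = np.random.default_rng(1)
stored = rng.normal(size=(6, 16))              # 6 continuous-valued memories
query = stored[2] + 0.1 * rng.normal(size=16)  # a noisy version of memory 2

beta = 1.0 / np.sqrt(16)
via_hopfield = hopfield_update(stored, query, beta)
via_attention = attention(query[None, :], stored, stored)[0]

# Under these assumptions the two computations are the same arithmetic.
print(np.allclose(via_hopfield, via_attention))
```

In a real transformer the queries, keys, and values come from learned linear projections of the input, so this shows only the shared core of the two computations rather than a full attention layer.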
Stelios: That’s fascinating, and if my memory serves me well, a Hopfield network is a recurrent neural network, right?
Tom: Yes. That’s right. Yeah.
Stelios: Whereas the transformer is based on convolutions, which are essentially more of a feed-forward model. I remember there was lots of fascination around recurrent neural networks because, I guess, some might say that they have in some ways, let’s say, infinite capacity because of the way they work. But how do you find recurrent neural networks in research and applications these days? Because I feel that while they are quite powerful, I think they’ve been overshadowed by convolutions, like in the example you just gave, I guess because, from a computational perspective, they’re just easier to handle. I don’t know what you’re seeing in this area.
Tom: Yeah, I think that’s exactly right. It still appears that way to me. Granted, we are the ones that make the computers, right? The architectures, which make one of these structures more efficient than the other. I think that, yeah, it’s an open question as to which one is better. No doubt our brain is a recurrent neural network. However, that doesn’t mean that we have to model everything off the brain. There’s the classic example of an airplane and a bird, and you know the engineering trope that is associated with that.
But my opinion, actually, is that we don’t necessarily have to use a recurrent neural network to use ideas developed over many decades in Hopfield networks. In theory, the idea of transformers is to apply sort of one step of a recurrent update, and just treat the vector length as the number of neurons in one layer. Instead of doing a convolution, well, you do something like a convolution, and just have the same length vector at the next step. You could theoretically think of those neurons as identified across steps when analyzing it from the recurrent neural network perspective, which is not intuitive or physical, I take it, in the feedforward or convolution perspective.
But if you at least grant that, then from there I think you can do everything theoretically that you did before in the recurrent networks.
Stelios: Okay, and so what are the benefits of studying Hopfield networks and attractors these days, in the current state of neural networks and cognitive science?
Tom: Well, transformers themselves have already proven hugely effective, and the question is why, I guess. Why are they so effective, and how in particular could they be even more effective? We even see that particular modifications to transformers for different modalities or different tasks can improve their performance. But the question is why. There are intuitions and heuristics that people use, and there is developing theory. But I think that we now have this treasure trove, this secret tunnel back to an amazing wilderness, which is now looking a bit dusty on the shelves: the Hopfield network papers, decades old, a lot of them.
Given we have this connection, I think that we should try to exploit it, and we should try to look back and see what those old papers are doing. What are those old ideas doing? Can we recast or reinterpret some of those old ideas into transformers? They were very general, very theoretical, most of them, and there’s already basically a wealth of data, or knowledge, back there that we can still use today, but it requires some math. So, I’m not necessarily going to go into all the details of that in particular, but I can speak of a couple of examples if you’re interested.
Stelios: That’s interesting. What I wanted to ask you next is around general artificial intelligence. So what’s your feeling about this? Obviously, everyone has an opinion. None is always the truth. So, do you envision that maybe some of the work that you’re doing in computational neuroscience is going to eventually lead to general AI, like recurrent neural networks or maybe some of the old networks revisited with more computational resources? Or do you think that maybe we’ll get something like general AI simply using the current deep learning architectures and just making them bigger and bigger? I know DeepMind is creating this very big network, which tries to do multiple tasks in one go. Like little pictures, I don’t know. It generates stuff, understands language, all that. So, what’s your intuition about this, and what are researchers saying in this area?
Tom: I think a lot of people, at least in my neck of the woods, will say, “No, there’s not enough there yet to make general AI.” However, I don’t think that we even have a good definition or agreement on these terms and what qualifies or not. So, that itself is problematic, and I think it will lead some people to claim that we already have that, or that we already have sentient AI or something. We may well have it under some definitions. So, I think it’s maybe more of a philosophical question, and I think an unsettled one for now. But at the very least, I would like to see more cognitive science, computational neuroscience, and even other forms of mathematics in AI. I think that they will actually be very useful if we do that.
Stelios: Yeah, actually, if we look, let’s say, 50 years back into the past, then we’ll see that cognitive science was more integrated with AI and linguistics. Now, it seems that things are segmented so much that some researchers might be saying, “Well, I’m only doing deep learning,” and might not even say machine learning anymore. Do you think that in the search for general AI, we might observe a reintegration of different fields, or of cognitive science?
Tom: I think we have to, and I mean, just as one example, Yann LeCun had a white paper, I think earlier this year, that basically was advocating for this sort of thing. He’s not alone in that and is certainly not the first to do so. But he’s a big figure in that space, and I think it was great that he sort of acknowledged that. But I think Hinton, Bengio, all the big players are sort of hinting at it, and have for a long time been saying, “We should try to reunite things a bit more.”
Unfortunately, there’s a bit of a cultural issue or misunderstanding, I think, sometimes between these different fields. People ultimately care about different things as well. If you’re an engineer that cares about sort of outcomes and products compared to if you’re a scientist, and you just care about the nature of things, those are totally different paths people might be interested in pursuing or outcomes that people are measuring. So, I think there are some cultural problems that we will encounter. I think if we can overcome them, then we’ll enjoy the fruits of all of our different expertise and all of our different resources.
Stelios: Yeah. I think maybe part of the problem, in this case, is that AI has been commercialized to a large extent. So, I guess a large part of the money for research and development is coming from Big Tech, and they’re primarily interested in applications. I remember, for example, reading about DeepMind last year or 2 years ago. There was this criticism that they hadn’t produced enough value for Google. They were too academic, and I guess you’re always going to have this kind of tension. So, if you say, “I’m a cognitive scientist,” you do know that whatever it is you’re working on, it’s probably not going to have an immediate application, right? It might not even come down the line.
Tom: Yeah. Exactly, and I mean, I think the easiest example I can think of for this general claim is that we use vector spaces basically everywhere in machine learning. Why? Well, I guess because the computer structures that we have set up and commercialized, and which are abundant, are those which rely on linear algebraic structures. That’s okay, but there are other structures. There are other kinds of things out there, and certainly, I think, if we ponder long enough on what a computer is and what intelligence is, we realize that it’s not obvious at all that we should rely only on the structures that we have in our current arsenal. That’s actually quite a limited subset, or a limited set of definitions, of what intelligence is and what a machine can do or be.
Stelios: That’s a very interesting point and, at the risk of getting a bit technical: if we weren’t using vectors, what data structures do you think we could use in order to represent knowledge and computation?
Tom: I mean, an obvious area, and one which is already being explored, is graphs, where we have vertices and edges, or networks with nodes and connections between the nodes. The classical example, for anyone that doesn’t know what such a structure looks like, is a social network like Facebook or Twitter, where the people are the nodes or vertices and the edges are the social connections between people. But you can even have high-dimensional generalizations of that.
You can have complexes, for example, and these are starting to get away from geometry, and you go down to a topological structure. After that, you can even go down to just an algebraic structure. I think, for example, of structures like sheaves. There was a recent paper on arXiv about them. This is the idea of attaching different objects above the vertices and edges, which was recently done for a neural network, but still that was just attaching vector spaces above these. So, it wasn’t particularly interesting. Yeah, I think basically there are many more structures out there which are interesting and which could be viable.
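For readers who have not worked with graphs, the social network example Tom mentions can be sketched in a few lines. The people, the friendships, and the function names below are hypothetical and purely illustrative. The graph is stored as a dictionary that maps each person (a vertex) to the set of people they are connected to (their edges), and a breadth-first search counts how many connections separate two people.

```python
from collections import deque

# Hypothetical people and friendships, purely for illustration.
friendships = [("Ana", "Ben"), ("Ben", "Chloe"), ("Chloe", "Dev"), ("Ana", "Eli")]

def build_graph(edges):
    """Map each person (vertex) to the set of people they are connected to."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def degrees_of_separation(graph, start, goal):
    """Breadth-first search for the fewest connections linking two people."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, distance = queue.popleft()
        if person == goal:
            return distance
        for friend in graph[person]:
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, distance + 1))
    return None  # the two people are not connected at all

graph = build_graph(friendships)
print(degrees_of_separation(graph, "Ana", "Dev"))  # Ana -> Ben -> Chloe -> Dev: 3
```

Nothing in this representation is a vector: the relationships themselves are the data, which is the kind of alternative structure being discussed.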
Stelios: Yeah, that’s a good point. I remember many years ago when I was studying cognitive science. There was this course talking about knowledge representation and it was essentially talking about different mathematical models of representing knowledge within cognitive science. It looks like everything is dominated by vector spaces, but there are other ways to represent knowledge.
Going back to your example about the airplane and the bird, for those of you who don’t know about it, there’s this argument that just because we know that birds can fly doesn’t mean that creating a flying machine means that you have to imitate birds 100%, right? Because we don’t. So, it’s the same with general AI, just because we know that the brain is intelligent doesn’t mean that we have to imitate it 100% to create intelligence.
Going back to this example, we don’t know which would be the best data structures for intelligence, I guess. Some of the things around vectors and tensors and all of that were a bit accidental, in that neural networks turned out to be very effective. Then someone realized, “I can use GPUs for that,” because GPUs are essentially optimized for linear algebra, and one thing led to another, which was, I guess, a bit unintentional. Nvidia never intended to get into AI, and who would have imagined, maybe 30 years ago, that tensors would be so popular in this type of computational machine?
Tom: Yeah, exactly. I personally think things like lattices could be extremely useful. Yeah, there are so many different types of structures out there, basically, and people are unfortunately sort of stuck in what works. No doubt it works, and so maybe that’s a reason to stick with it. But I wonder what history would be like if we had just happened upon a different sort of data structure for Nvidia to make computer games with, or whatever else they optimized for, and whether a different set of mathematics would be the basis that we’re now using.
Stelios: Yeah, I completely agree. So, I guess only time will tell. So, before we go, if you could make a prediction for the next let’s say 5 to 10 years, what do you think is going to really stand out in the world of AI and neural networks and cognitive science, if you could say, for example, one thing?
Tom: I think it’s kind of a theme of what we’ve been discussing. I think people will come to realize that there are deeper connections between these structures and that can be sort of useful going in both directions to understand the brain and to understand and create better AI. In that process, we’ll probably discover some new structures and techniques which are not currently mainstream, I think in either field.
Stelios: Yeah, sounds good. I guess my prediction would probably be something along the same lines. I won’t be as bold as saying, “Oh, we’re going to have general AI in five years,” or something like that. I don’t think that’s going to happen. We’re definitely going to experience some incremental progress, or maybe some integration of different fields, which is definitely a good thing. It would probably get us closer to understanding the human mind and also what intelligence really is. How can we actually create more intelligent machines than those we currently have? So yeah, I think our time is up, but thank you, Tom. It was a very interesting discussion.
Tom: Yeah, thank you. Thanks very much.
Stelios: Thanks everyone for staying here with us today. We hope to see you again in some of our other podcasts. Make sure to check out thedatascientist.com for more content about AI, data science and blockchain. Before I go, a couple of notes. If any of you are interested in pursuing a career in data science, make sure to go to thedatascientist.com and drop me a message. I’ll be very happy to help you out, because we’re starting a new program for aspiring data scientists. Also, if any of you are an entrepreneur or CEO or manager and you need some help with data science in your organization, I’m more than happy to help you out. We have a network of partners specializing in different areas of applied data science and related fields. So, just make sure to get in touch. So, thanks everyone, and hope to see you soon. Thank you.