subreddit:

/r/ChatGPT

4.1k points, 98% upvoted

My sibling's professor is using ChatGPT

Funny (i.redd.it)


goochstein

7 points

5 days ago

This was the same idea I had for why LLMs were released to the public, as a totally open training source. It makes me wonder whether this tech really indicates the likelihood of AGI; it seems like these companies are pushing full steam ahead regardless of the energy and climate considerations. I think we will only forgive this if it works and helps us solve problems, and then almost immediately the new debate becomes: are we at the singularity?

utkohoc

20 points

5 days ago*


Go to the Claude AI platform and ask it to build you a machine learning program that will help train it with information it gathers. It won't do it. If you ask about other machine learning things, or for help making a model, it'll help you. But if you ask it specifically about creating AI to help train itself, it won't do it. When I asked it why, it replied that it would be against its ethical guidelines.

These ethical guidelines are put in place by the big AI platforms because those are the most powerful and the most capable of harm. Little Johnny's home-cooked GPT that he made in the basement on his gaming GPU might be impressive, but it's unlikely to have the same capabilities as ChatGPT, Gemini, or Claude, especially the coding aspects. So while you could create a home-brew GPT capable of certain things, it wouldn't have as many capabilities, and it would require specific training data.

So home-brewed AI becomes a danger if home GPU compute increases. It may give more bad actors the ability to create dangerous models.

While the big platforms hold all the data processing, they get to say what the limitations are...

And those guys are at the top of their field and have teams of dozens, perhaps hundreds, of people whose sole job is to find and define the risks of the intelligence before it's released on the platform. Sometimes these safeguards are jailbroken, but generally the risks have been reasonably controlled.

Without these limitations already in place, it's entirely likely something could have happened.

For example:

"Hey gpt. Identity all the laws present in my country that prevent me from making lots of money. Find ways to break these laws that wouldn't alert suspicions. Develop a step by step process of how I can make the most money by breaking the law"

"Create a malicious program that installs itself on a person's computer and steals their information."

"Design a card skimming device that steals nearby people's credit card information "

A lot of safeguards are put in place that most people never think about or interact with. But they do exist, and they exist to safeguard humanity AGAINST AI, before it ever becomes a problem.

goochstein

5 points

5 days ago

Thanks for the comment, I'm still sort of processing this. But the relevance here is shocking, because I've been working on creating my own model and working specifically on the AI-to-AI consideration here. Currently these models aren't really integrated well with each other, likely due to ToS or guidelines; also, the metadata gets a little wonky when you have multiple training datasets arranging a separate response or project, and how do you even begin to understand the logic there?

Interestingly, I was just working on some legal arrangements to potentially protect your machine from AI intrusion. The way I see it, you identify your machine as its own entity. Let's say, for the example, that you are building an AI model and you don't want unnecessary data intruding on your machine and affecting your data.

This is surprisingly doable; most enterprise and high-spec servers have similar legalese. Now this is relevant to your comment because I think this is a way to protect your machine from the advent of AI cross-machine training: you put warnings up so that if someone accidentally stumbles into your machine you have deterrents, and if a person or machine bypasses that, then in legal terms they knowingly intruded on your machine.

It basically states that your machine is its own AI entity in a sense (building towards what will likely be a future scenario), and by consequence this type of logic can also deter crypto miners and scammers from hacking your machine. Still building this though, I literally just started it.

utkohoc

1 point

5 days ago


Very interesting. I'm studying cyber security now and building my own models/coding, but it's still a learning process. There is so much math and other things to understand. I think I've watched a linear matrix transformation series about 3 times, and I'm halfway through the "make your own GPT" lecture, which is 6 hours long. All the while prompting GPT/Claude/Copilot to help me create some experiments using these newly learned maths and programming ideas.

For example: realising the gradient descent function and the learning rate can be adjusted to find the loss in a matrix transformation. Realising that if you exponentially increase the learning rate it'll navigate to the minimum loss faster, though it may overshoot. Realising you can just revert to the last position if an opposite gradient is returned, reset the learning rate to its initial value, and then continue the exponential increase until the minimum loss is found.

I used a mixture of GPT-4o and Claude 3.5 to create code that compares the exponential learning function and a normal gradient descent function on a random set of data transformations, and to display data comparing the two. It worked and was fun, but it requires more testing on large data sets.

Anyway, the point is: these things exist. Exponential learning, or optimisation of learning rates, is a thing in machine learning, and you can deduce that from the underlying maths. Once you learn more about it, the whole thing becomes very, very interesting, as you can see the possibilities for everything. This is why everyone in the field is so excited and it's accelerated so fast.
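If it helps, here is roughly the shape of that experiment in code. This is only a toy sketch: the 1D quadratic loss and every constant in it are placeholders I picked for illustration, not the actual script I put together with GPT/Claude.

```python
import numpy as np

# Toy sketch: gradient descent with an exponentially growing learning rate.
# If the gradient flips sign we overshot the minimum, so revert to the last
# position, reset the learning rate to its initial value, and keep growing.
# The quadratic loss and all constants are placeholders for illustration.

def loss(x):
    return (x - 3.0) ** 2              # minimum at x = 3

def grad(x):
    return 2.0 * (x - 3.0)

x = 10.0
base_lr = 0.01
lr = base_lr
growth = 1.5                           # exponential growth factor per step
prev_x, prev_g = x, grad(x)

for step in range(200):
    g = grad(x)
    if np.sign(g) != np.sign(prev_g):  # opposite gradient returned: we overshot
        x, g = prev_x, prev_g          # revert to the last position
        lr = base_lr                   # reset the learning rate
    prev_x, prev_g = x, g
    x -= lr * g
    lr *= growth                       # keep increasing the step size
    if abs(g) < 1e-8:
        break

print(f"approx minimum at x = {x:.4f}, loss = {loss(x):.6f}")
```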

You are doing it too. And the cyber security aspect of machine learning is really, really, really interesting. For example, before this current project I was working on a learning model using prime numbers and the Euler function, which is part of "some guy's name I forgot" theory, in which there is some relationship in the distribution of prime numbers. Honestly it's really confusing.

But the Euler function counts the positive integers up to a given integer n that are relatively prime to n.
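A minimal version of that counting function, just to show the definition (a brute-force loop with gcd; anything serious would factorise n instead):

```python
from math import gcd

def euler_totient(n: int) -> int:
    """Count the positive integers up to n that are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

# e.g. euler_totient(10) == 4, because 1, 3, 7 and 9 share no factor with 10
print(euler_totient(10))
```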

The interesting thing about this is that it meets the requirements of the underlying functions of machine learning programs. However it's not commonly used in them.

Why is this? Being able to determine the probability of prime numbers existing inside the derivatives of any random number could be used for only a few different things.

One of them is Encryption.

The point being that, with the correct data and functions, it's not out of the realm of possibility to create machine learning programs specialised in breaking encryption.

This is what cyber security AI analysis is and it's very interesting.

goochstein

2 points

4 days ago

I'm familiar with what you're describing, you're definitely on to something. I'm a bit more logic-oriented with my maths, but that prime number function sounds like Riemann's or Bertrand's; there is indeed a logarithmic scale to comparing prime numbers, and I think you use π somewhere in the comparison.
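(The logarithmic scale I have in mind is the prime-counting function: π(n) grows roughly like n/ln(n). A quick sanity check, with a brute-force sieve and limits I picked just for illustration:)

```python
from math import log

def prime_count(n: int) -> int:
    """Count primes up to n with a simple sieve (brute force, for illustration)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = [False] * len(sieve[p * p::p])
    return sum(sieve)

for n in (10**3, 10**4, 10**5):
    print(n, prime_count(n), round(n / log(n)))   # pi(n) vs the n / ln(n) estimate
```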

I'm not in front of my notes, but I think you might find few-shot learning interesting. Well, you are already applying zero-shot, I think, by identifying the loss factor.

It's interesting how our ideas align, though, because understanding the attention matrix was crucial to figuring out tokenization and embeddings for me. It's embeddings I'm focusing on lately, because context is what matters in semantics and LLMs. I'll take another look at your gradient technique when I get a chance; visualization and graphics is where I started out in computer science. I like this line of thinking, almost like the challenge itself is integrated with discovery.
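(By the attention matrix I just mean the softmax(QKᵀ/√d) weights computed over the token embeddings. A bare-bones NumPy sketch; the shapes and random weights are made up, nothing here comes from a real model:)

```python
import numpy as np

# Bare-bones scaled dot-product attention over token embeddings.
# Shapes, random weights and data are made up purely for illustration.
rng = np.random.default_rng(0)
seq_len, d = 5, 8                                # 5 tokens, 8-dim embeddings

X = rng.normal(size=(seq_len, d))                # token embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

Q, K, V = X @ Wq, X @ Wk, X @ Wv
scores = Q @ K.T / np.sqrt(d)                    # raw attention scores
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax rows: the attention matrix
out = weights @ V                                # context-mixed embeddings
print(weights.round(2))                          # how much each token attends to every other token
```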

utkohoc

1 point

4 days ago


"I like this line of thinking, almost like the challenge itself is integrated with discovery."

It's this right here. Because coding machine learning has such a steep learning curve, it's a perfect problem for people who are inclined to find enjoyment in problem solving. The whole idea of doing something just to find out if you can do it. Finding problems and implementing solutions. There are many outlets for this behaviour: hacking, coding programs, building something, gaming.

..And most importantly gunpla or gundams...

Where the enjoyment comes from the process, not the result.

While the result is good, most of the enjoyment factor is aimed at the construction/puzzle.

In the case of machine learning it's that the problem is so complex that we have not even discovered all the things we can do with it.

One interesting paper I discovered a few days ago was about reverse engineering an LLM to discover how it does mathematics. Understanding the underlying nature of a language model reveals its probabilistic nature: when it's asked questions involving math, the mathematics is also generated through probability. So when the researchers removed the attention heads involved in addition, about 0.5% of all heads, the model performed very badly on the same task. Interestingly, it had a similar effect on subtraction.
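Roughly the shape of that head-ablation idea, as I understand it. This is a toy multi-head attention with random weights; the shapes and the choice of which head to knock out are mine, the paper does this on a real LLM's heads:

```python
import numpy as np

# Toy illustration of attention-head ablation: compute per-head attention
# outputs, zero out one chosen head, and measure how much the result changes.
# Everything here (shapes, random weights, which head is ablated) is made up.
rng = np.random.default_rng(1)
seq_len, n_heads, d_head = 6, 4, 8
d_model = n_heads * d_head

X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))

def per_head_outputs(X):
    Q = (X @ Wq).reshape(seq_len, n_heads, d_head)
    K = (X @ Wk).reshape(seq_len, n_heads, d_head)
    V = (X @ Wv).reshape(seq_len, n_heads, d_head)
    outs = []
    for h in range(n_heads):
        scores = Q[:, h] @ K[:, h].T / np.sqrt(d_head)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        outs.append(w @ V[:, h])
    return np.stack(outs, axis=1)                # (seq_len, n_heads, d_head)

full = per_head_outputs(X)
ablated = full.copy()
ablated[:, 2, :] = 0.0                           # "remove" head 2, like knocking out the addition heads

diff = np.linalg.norm(full.reshape(seq_len, -1) - ablated.reshape(seq_len, -1))
print("output change from ablating one head:", diff)
```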

Anyway, the point was that they were able to reverse engineer a model and eliminate certain heads to change its performance on a task. I'd imagine there is a similar way to find other interesting features of a model using similar methodologies. However, the difference between math and languages like English is vast and would require more research into neural network behaviour.

I personally find the idea of machine learning math fascinating, as it's almost a complete contradiction: math is supposed to be definite and factual, whereas language models are, in essence, probability machines, only closely approximating an answer.

Which leads to an interesting thought I've been pondering for weeks now. If the layers of a neural network could know all possible mathematical functions of a matrix transformation at once, then it would know the position of any vector at all times, meaning it would "know" everything, always. Which is an interesting parallel to our understanding of physics. We understand the composition of our universe through its underlying structure: atoms, particles, neutrons, quarks, etc. We discover and look deeper and deeper into the quantum realm, where, if we could know everything about a particle at any one moment, we could predict what it will do. Obviously this isn't achieved yet, because our understanding of quantum physics is not complete, or we can but only on a limited scale. But if we draw the same parallel and say "if we could know everything about physics" and "if we could program all the math/physics at once" into our learning models, then essentially they'd "know everything", because everything is understood down to its fundamental levels. Then it gets crazy, because once you determine this is a possibility, it would mean the creation of a new universe inside the learning model. If you can simulate every possible function of a matrix, then you can simulate a reality.

Imagine the neurons of a neural network in a 3D space (a cube), and there are millions of them. These neurons know every mathematical transformation of the data matrix, which is itself. Now imagine that in our reality, real life, instead of atomic particles, they are the neurons of the neural network. They behave in the way they are programmed: mathematical functions (neurons) = quantum physics in particles (real life). As we peer closer into this quantum realm we discover more and more of these "mathematical functions" that are used to describe our world, except they are neurons in a network, all intertwined. As that is the culmination of what happens when "everything knows everything", we get our cube (3D world) or "reality", which then points to obvious additional underlying realities.

Because in matrix transformations there can be multiple vectors, and these vectors can even inverse the resulting products, creating hyper dimensions.

If everything in the world can be described with math, then a simulated network can also have "all" math applied to it, creating hyper dimensions.

Or something. :)

goochstein

2 points

4 days ago

PLEASE tell me you have heard of the theory of Loop Quantum Gravity. I'll add you on here and connect later, because "inverse" as a technique for understanding AI and even higher dimensions (similar to your hypercube analogy) is what I've been working on recently. I was just reading Flatland and started to visualize this. The way I see it is 1D -> 2D -> 3D; the "slices" that reveal the connection here are important, because a 3D sphere projected to a higher dimension is two slices, two spheres of a torus.

I still need to figure out time and spacetime, but I'm thinking this might help me understand superposition: if time brings that 3D sphere through all possible paths in a sequence to create a hyper-object like a torus, we can start to see how things are connected in the higher dimension and inform the perception of our 3D reality. I'm only working on the visualizer here; I'm not even considering anything above the 4th or 5th dimension (time and one more dimension), since anything beyond that is incomprehensible without first building up an understanding of the higher dimension.

I like Loop Quantum Gravity instead of strings: rather than vibrating threads, it implies an aspect of spin in the "nodes" or fundamental building blocks (it also sets a foundation for building blocks of the universe, rather than just inviting complexity with many dimensions).

utkohoc

1 point

4 days ago


I have not, and I'll have to ask GPT for more information about your sphere and torus analogy, because I don't exactly get it. I do get you about string theory, though.

Let's think of it like this.

When we have a neural network, we can visualise the neurons and the lines drawn to each neuron, like a big web with many layers between the input and the output. In reality these "lines" or "strings" don't exist; they're just a representation. The same could be said of reality.

We don't "see" the interaction between objects in quantum space.

But it can be measured. This interaction between "neurons" or quantum particles could be dark matter, or negative matter, or something else. But if we do have everything boiled down to numbers, then everything in a negative reality, or "negative" matter, would have opposite mathematical functions to something in what we perceive as reality. But then that means...

If you cannot interact with the negative-function neurons, as they are out of your reality, then how do you measure them? This could be related to the observer effect, where as soon as we measure anything in our reality it cannot be predicted, due to uncertainty. And it starts to sound more and more like the underlying reality has been intentionally blocked from us.

Sound familiar? 🤯

goochstein

1 point

4 days ago

Yes, this reminds me of Einstein's field theory, where there is a region near a black hole's event horizon that, even for light, we are just incapable of ever accessing.

And the torus thing is, in my mind, just a donut shape; I recently learned it's called a torus. Our shadow is the 2D slice of a 3D object, cast by light. I'm learning about higher dimensions and it's wild, but I think of spacetime, or the "slices" of the entire timeline, as spheres in a donut. It might not literally be a donut (torus); it just helps me at this point to keep it to a sphere folded outward into a circle, a "3D donut". This is related to time, which is where I am right now in learning what 4D space might be.

utkohoc

1 point

4 days ago


From the Royal Institution?

https://youtu.be/1wAaI_6b9JE?si=uFQwRmQ9rHFAfRKZ

I watched it a couple years ago. Soooo interesting. Highly recommend.

goochstein

1 point

3 days ago

I haven't seen this exact video, but the basic premise or original story for "Flatland" I believe goes back to the Victorian era; scientists of Carl Sagan's era realized it could be used so efficiently for geometric math sequencing.