Support Vector Machines on a one-dimensional array

While exploring the Titanic dataset, I've been fitting an SVM first on several features together and then on each feature individually, using the code below:

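import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

# titanic, X_train, X_test, y_train and y_test are defined earlier (not shown)
quanti_vars = ['Age','Pclass','Fare','Parch']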

imp_med = SimpleImputer(missing_values=np.nan, strategy='median')
imp_med.fit(titanic[['Age']])

for i in (X_train, X_test):
    i[['Age']] = imp_med.transform(i[['Age']])

svm_clf = SVC()
svm_clf.fit(X_train[quanti_vars], y_train)
y_pred = svm_clf.predict(X_test[quanti_vars])
svm_accuracy = accuracy_score(y_pred, y_test)
svm_accuracy

for i in quanti_vars:
    svm_clf.fit(X_train[i], y_train)
    y_pred = svm_clf.predict(X_test[i])
    svm_accuracy = accuracy_score(y_pred, y_test)
    print(i,': ',svm_accuracy)

The last for loop raises

ValueError: Expected 2D array, got 1D array instead

which leaves me puzzled: isn't it logical for an SVM to process a single feature?

Answer №1

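It dawned on me that I simply needed to use double brackets for proper subsetting: X_train[i] returns a one-dimensional Series, whereas X_train[[i]] returns a one-column DataFrame, which is the 2D input the estimator expects. Thus, the code snippet below: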

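for i in quanti_vars:
    # Double brackets keep the selection 2D: shape (n_samples, 1)
    svm_clf.fit(X_train[[i]], y_train)
    y_pred = svm_clf.predict(X_test[[i]])
    # accuracy_score takes (y_true, y_pred); the value is the same either way
    svm_accuracy = accuracy_score(y_test, y_pred)
    print(i,': ',svm_accuracy)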

yields the following output:

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
  kernel='rbf', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)
Age :  0.5874125874125874
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
  kernel='rbf', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)
Pclass :  0.5874125874125874
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
  kernel='rbf', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)
Fare :  0.42657342657342656
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
  kernel='rbf', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)
Parch :  0.6153846153846154

(Admittedly not the most elegant solution, but it gets the job done.)
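
To make the difference between the two kinds of subsetting concrete, here is a minimal sketch on a toy DataFrame (the values are made up and only stand in for the Titanic data). It also shows a NumPy alternative, reshape(-1, 1), which should give the same 2D shape.

```python
import pandas as pd

# Toy stand-in for the Titanic frame; the values are made up for illustration.
df = pd.DataFrame({"Age": [22.0, 38.0, 26.0], "Fare": [7.25, 71.28, 7.92]})

single = df["Age"]    # single brackets -> pandas Series, 1D
double = df[["Age"]]  # double brackets -> one-column DataFrame, 2D

print(type(single).__name__, single.shape)
print(type(double).__name__, double.shape)

# Alternative route: reshape the 1D values into a single column.
as_column = single.to_numpy().reshape(-1, 1)
print(as_column.shape)
```

Either the one-column DataFrame or the reshaped array has the (n_samples, n_features) layout that the error message asks for.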

Answer №2

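It's quite simple: scikit-learn estimators expect X as a 2D array of shape (n_samples, n_features), so select the column with double brackets in both the fit and the predict call, for example:

y_pred = svm_clf.predict(X_test[[i]])

Note that wrapping the Series in a plain list, as in [X_test[i]], does produce a 2D array, but one of shape (1, n_samples): a single sample with many features rather than many samples with one feature. It is also advisable to bin the fare amounts into about 10 classes rather than using them directly, because the difference between $30 and $50 is significant, while differences matter less as the amount increases. For instance, there isn't a substantial difference between $300 and $500.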
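
One possible way to turn the fare into 10 classes is quantile binning with pandas.qcut. The sketch below is only an illustration under stated assumptions: the fares are synthetic (a right-skewed random sample, not the real Titanic column), and the column name FareBin is hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic, right-skewed fares standing in for the Titanic 'Fare' column (assumption).
rng = np.random.default_rng(0)
fares = pd.DataFrame({"Fare": rng.lognormal(mean=3.0, sigma=1.0, size=200)})

# Quantile binning: each bin holds roughly the same number of passengers,
# so bins are narrow for cheap tickets and wide for expensive ones.
# duplicates="drop" merges coinciding bin edges, which may leave fewer than 10 bins.
fares["FareBin"] = pd.qcut(fares["Fare"], q=10, labels=False, duplicates="drop")

print(fares["FareBin"].value_counts().sort_index())

# Keep the double brackets so the binned feature is still 2D for the classifier.
X_fare = fares[["FareBin"]]
print(X_fare.shape)
```

Quantile bins are one choice among several; fixed-width bins on the logarithm of the fare would be another way to compress the differences between large amounts.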
